### Two Efficient Methods to Reduce Power and Testing Time

Il-Soo Lee, Jae Hoon Jeong, Tony Ambler, Dept. of Electrical and Computer Engineering, University of Texas at Austin, {ilee, jeong, ambler}@ece.utexas.edu

#### Abstract

Power dissipation and testing time are reduced by forming two clusters of don't-care bits, one inside an input test cube and one inside its response test cube. A new reordering scheme for scan latches is proposed to create these clusters, and two proposed reconfigured scan architectures guarantee that the clusters are removed from the scan operation. The reduction in power and testing time is directly proportional to the size of the clusters. Results on ISCAS'89 benchmark circuits show good improvement in both power consumption and testing time.

#### Categories and Subject Descriptors

B.8.1 [Hardware]: Reliability, Testing, and Fault-Tolerance

#### General Terms

Design, Reliability

#### Keywords

Reordering Scan Latches, Scan Architecture, Power, Testing Time

#### 1. Introduction

The integrated circuit (IC) is a critical component of modern electronic devices. As technology advances, more functions are integrated into a single IC than ever before, and the resulting increase in circuit density brings many problems. One of them is power. Power dissipation is becoming a major concern because excessive power can severely damage an IC. Power dissipation during testing is even more serious, since it is much greater than in normal operation [1]. Another issue is testing time, which increases as the circuit under test (CUT) becomes more complex and more highly integrated. Testing a system-on-chip (SOC) in particular requires a great deal of time. Hence, SOCs are usually tested concurrently to reduce testing time. However, restrictions such as a limited number of pins constrain concurrent testing. Many researchers are therefore looking for ways to reduce testing time.

R. Gupta et al. in [2] introduced an algorithm that provides an optimal ordering of scan latches in a single scan chain such that testing time is reduced, without considering scan routing. S. Narayanan et al. in [3] proposed using multiple scan chains to reduce testing time in such a way that the scan elements accessed more frequently are placed in a shorter scan chain. D. Ghosh et al. in [4] proposed reducing power dissipation and testing time by partitioning a scan chain into multiple scan chains and by reordering scan latches. The work in this paper uses these two techniques and adds two new methods: the proposed reordering, and the scheme of leaving the clusters of don't-care bits out of the scan operation. These two new methods give an additional reduction. L. Whetsel in [5] introduced a new clock scheme and a reconfigured scan architecture to decrease the power consumed by transitions during scan shifting of test data. The work in this paper uses the same architecture as in [5], but it achieves a higher reduction in both power dissipation and testing time. P. Girard et al. in [6] showed that the number of transitions in a linear feedback shift register (LFSR) can be reduced by a new clock scheme. S. Samaranayake et al. in [7] introduced dynamic scan, which reduces the volume of test sets and the test application time by taking advantage of don't-care bits in test sets. The scheme in [7] makes use of don't-care bits in almost the same way as this paper, but the circuit overhead needed in [7] to hold the control signals is large. I. Hamzaoglu et al. in [8] proposed a reconfigurable scan architecture that uses the parallel test mode of scan chains for most of the faults and the serial test mode for the remaining faults. J. Aerts et al. in [9] introduced various scan architectures with 3 scan chains and analyzed them in terms of decreasing the size of test sets. R. Sankaralingam et al. in [10] proposed disabling the clocks to some scan chains for some portions of the test set, which reduces the switching activity in both the scan chains and the CUT. O. Sinanoglu et al. in [11] and I. Lee et al. in [12] dealt with both issues, power dissipation and testing time, simultaneously.

ISLPED'05, August 8-10, 2005, San Diego, California, USA.

Copyright 2005 ACM 1-59593-137-6/05/0008...\$5.00.

The proposed methods start with a reordering scheme in which an input test cube and its response test cube are rearranged together in descending (or ascending) order of the number of don't-care bits in their columns, creating two clusters of don't-care bits, one inside each cube. These two clusters must be the same in size and shape. They are not used in the scan operation of the two newly reconfigured scan architectures, which are designed to remove them from the scan operation. Thus the reduction in power and testing time is directly proportional to the size of the clusters. Furthermore, the two proposed reconfigured scan architectures need no large extra control circuit: the control overhead is about the same as in [5], or slightly more than that of the conventional scan architecture. Note that the routing issue incurred by reordering scan latches is not considered in the proposed methods.

The remainder of this paper is organized as follows. Section 2 discusses the proposed reordering scheme. Section 3 introduces the two new reconfigured scan architectures. Section 4 describes how the reordered test sets with their clusters interact with the two proposed scan architectures to reduce power and testing time in the scan operation. Finally, Section 5 shows the results obtained by simulation with the ISCAS'89 benchmark circuits.


#### 2. Proposed Reordering of Scan Latches

The basic idea of the proposed reordering scheme is to bring don't-care bits together by rearranging the columns of a test set (input and response test cube) in descending (or ascending) order of the number of don't-care bits in each column. The shape and size of the clusters are then determined by the way the don't-care bits in the test set are grouped. The clusters must consist only of don't-care bits, without any specified bit, as in Figure 1(b). Once the clusters are set, they are omitted from the scan operation of the two proposed scan architectures.



Figure 1. Input test cube and response test cube before & after applying the proposed reordering scheme.

Suppose that Figure 1(a) is a deterministic input test cube and its response test cube. The "number of don't-care bits in the column of test set" at the top of Figure 1 is the total number of don't-care bits in a column of the input test cube and in the corresponding column of the response test cube. For instance, the last column of the test set in Figure 1(a) has 9 don't-care bits, 5 of which come from the input test cube and the rest from the response test cube; they are lightly shaded in Figure 1(a). Figure 1(b) depicts the test set after the proposed reordering scheme has been applied, which creates a cluster in the bottom-left part of both the input test cube and the response test cube. These two clusters of don't-care bits, shown in darker shading, are the same in size and shape and are always stair-shaped, as in Figure 1(b). Here they have 3 steps, including the ground level. The 3 steps indicate that the input and response test cubes are divided, either row-wise or column-wise, into 3 smaller input and response test cubes, as in Figure 1(b). These 3 smaller test cubes in turn mean that the proposed scan architectures will have 3 partitioned scan chains. In general, the greater the number of steps in the clusters, the larger the clusters are.
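The column count and the joint descending reordering described above can be sketched in a few lines of Python. This is only an illustrative sketch: the representation of a test cube as a list of strings over `0`, `1`, and `X`, and the function names `dont_care_counts` and `reorder_descending`, are assumptions made here and do not come from the paper.

```python
from typing import List, Tuple

# Assumed representation: one string per test pattern, characters '0', '1' or 'X'.
Cube = List[str]


def dont_care_counts(inp: Cube, resp: Cube) -> List[int]:
    """Per-column number of 'X' bits, counted over both cubes together."""
    width = len(inp[0])
    return [
        sum(row[c] == "X" for row in inp) + sum(row[c] == "X" for row in resp)
        for c in range(width)
    ]


def reorder_descending(inp: Cube, resp: Cube) -> Tuple[Cube, Cube, List[int]]:
    """Sort columns by don't-care count (descending), moving both cubes together."""
    counts = dont_care_counts(inp, resp)
    # sorted() is stable, so columns with equal counts keep their original order.
    order = sorted(range(len(counts)), key=lambda c: -counts[c])

    def permute(cube: Cube) -> Cube:
        return ["".join(row[c] for c in order) for row in cube]

    # The same permutation is applied to the input and the response cube.
    return permute(inp), permute(resp), order
```

For a made-up 3-pattern test set with `inp = ["1X0", "XX1", "0XX"]` and `resp = ["X10", "XX0", "1XX"]`, the column counts are 3, 5, and 2, so the middle column moves to the front in both cubes.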

The reordering of scan latches is known to be an NP-hard problem [13], so a heuristic method is employed here. Before the details of the heuristic are explained, note that whenever a column of the input test cube changes position, the corresponding column of the response test cube must move to the same position. For instance, in Figure 1, when the first column of the input test cube moves to the fourth column position, the first column of the response test cube, which corresponds to the first column of the input test cube, must also move to the fourth column position. The two columns always move together. The ordered steps of the proposed heuristic method are as follows.

**Step 1:**

- Rearrange the columns of the test set in descending (or ascending) order of the number of don't-care bits in each column. Go to **Step 2**.

**Step 2:**

- If the test set is rearranged in descending (or ascending) order of the number of don't-care bits in each column, the bottom-left (or bottom-right) part of the test set is the natural place for the clusters of don't-care bits.
- Reserve the bottom-left (or bottom-right) part of the test set for the cluster of don't-care bits. Its size is set as large as possible at the beginning, and its shape must be a down-stair (or an up-stair). Advance to **Step 3**.

**Step 3:**

- Look for specified bits in the part reserved for the clusters. If one is found, reorder the columns so that the column containing the specified bit is replaced by a column from outside the reserved part that has only don't-care bits in the reserved rows.
- If a specified bit cannot be removed from the reserved part by reordering, reserve a smaller part for the clusters, with one step fewer than before, and repeat **Step 3**.
- If no specified bit is found in the reserved part, proceed to **Step 4**.

**Step 4:**

- The reserved part becomes the cluster of don't-care bits. Go to **Step 5**.

**Step 5:**

- Reorder the scan latches inside each smaller column-wise divided test set to obtain fewer transitions.

Note that the required number of steps of a stair-shaped cluster (the number of smaller divided test sets) is determined by how many scan chains are employed in the proposed reconfigured scan architectures, and the number of scan chains is chosen by the user. In this simulation, the initial number of steps is arbitrarily set to 5. If the clusters cannot be created, the number of steps is decreased one step at a time, down to 2, with a smaller part reserved for the clusters each time.
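A simplified sketch of Steps 2 to 4 may help to make the search for a stair-shaped cluster concrete. It is not the authors' implementation. Instead of the swap-and-retry loop of Step 3, it sorts columns by the row from which they contain only don't-care bits down to the bottom of both cubes, which decides directly whether a down-stair with a given number of steps exists. It also assumes that rows and columns are split into (nearly) equal groups. The names `tail_start`, `stair_order`, and `largest_cluster` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed representation: one string per test pattern, characters '0', '1' or 'X'.
Cube = List[str]


def tail_start(inp: Cube, resp: Cube, col: int) -> int:
    """First row r such that column `col` is 'X' in rows r..end of BOTH cubes."""
    r = len(inp)
    while r > 0 and inp[r - 1][col] == "X" and resp[r - 1][col] == "X":
        r -= 1
    return r


def stair_order(inp: Cube, resp: Cube, steps: int) -> Optional[List[int]]:
    """Column order that yields a down-stair cluster with `steps` steps, or None."""
    rows, cols = len(inp), len(inp[0])
    # Columns whose trailing run of don't-care bits starts highest go leftmost.
    order = sorted(range(cols), key=lambda c: tail_start(inp, resp, c))
    for pos, c in enumerate(order):
        group = pos * steps // cols                # column group, 0 = leftmost
        first_x_row = (group + 1) * rows // steps  # rows from here on must be 'X'
        if tail_start(inp, resp, c) > first_x_row:
            return None  # a specified bit would remain inside the reserved part
    return order


def largest_cluster(
    inp: Cube, resp: Cube, max_steps: int = 5, min_steps: int = 2
) -> Optional[Tuple[int, List[int]]]:
    """Try max_steps, then one step fewer each time, as in the retry rule of Step 3."""
    for steps in range(max_steps, min_steps - 1, -1):
        order = stair_order(inp, resp, steps)
        if order is not None:
            return steps, order
    return None
```

With the made-up cubes `inp = ["1X01", "0110", "X1X0", "X0X1"]` and `resp = ["01X0", "1X01", "X0X1", "X1X0"]`, no stair with 5, 4, or 3 steps exists, but a 2-step stair is found by moving columns 0 and 2 to the left. Step 5 (reordering inside each column group for fewer transitions) is not covered by this sketch.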

If the two clusters of don't-care bits cannot be defined and removed from the test sets, the methods in this paper do not apply. However, such a failure is unlikely, since most test sets contain more and more don't-care bits toward their bottom rows. At the least, small clusters can usually be obtained. In a system with multiple scan chains, clusters of don't-care bits of the same size can easily be created inside the test sets of the individual scan chains, since the test set for each scan chain contains plenty of don't-care bits. The next section introduces two new reconfigured scan architectures that keep the clusters of don't-care bits out of the scan operation.

#### 3. Proposed Reconfigured Scan Architectures

This section introduces two new reconfigured scan architectures. The first proposed scan architecture, shown in Figure 2, is composed of multiple partitioned scan chains with a buffer at the end of each partitioned scan chain. The partitioned scan chains share one scan-in and one scan-out. This architecture is referred to as the "multiple partitioned scan architecture". The second proposed scan architecture, shown in Figure 3, is created by inserting MUXes in the middle of a scan chain. These MUXes have two inputs: one is connected directly to the output of the scan latch in front of the MUX, and the other is connected to the input of the scan chain. This architecture is referred to as the "MUXed scan architecture".

#### 3.1 Multiple Partitioned Scan Architecture

Figure 2 presents the proposed multiple partitioned scan architecture. Each partitioned scan chain has one buffer at its end, which is designed to block the scan-out of test responses from that scan chain. The number of scan latches in each partitioned scan chain depends on the number of columns in the corresponding column-wise divided part of the test set. For instance, each of the 3 column-wise divided test sets in Figure 1(b) has 2 columns, so each partitioned scan chain has 2 scan latches. The controller is used to count and control the shifting cycles, the capture cycles, the buffers, and so on.





A modified version of the clock scheme in [5] is employed here to run the proposed scan operation. As mentioned above, the test set in Figure 1(b) can be divided row-wise into 3 smaller parts based on the clusters. While the first part of the test set is in the scan operation, the clock operates in such a way that only the first scan chain (Scan Chain A) shifts during the first 2 clocks, while the other two scan chains (Scan Chain B and C) stay idle. Then only the second scan chain shifts for the next 2 clocks while the other two scan chains rest, and likewise for the third scan chain. After every scan chain has shifted for 2 clocks individually, one clock is applied to all scan chains to capture the test response. For the second part of the test set, only the first two scan chains are involved in the scan operation, in the same way as in the first part. The last scan chain is permanently disabled for shifting, since it holds only don't-care bits until the end of the scan operation. For the last part, only the first scan chain is involved in the scan operation, while the last two scan chains are permanently disabled for shifting. These scan chains that are disabled for shifting are the main contributors to the reduction in power dissipation and testing time. The detailed scan operation follows in the next section.
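The per-pattern clocking just described can be written down as a small schedule generator. The sketch below is only an illustration under simplifying assumptions: every partitioned chain has the same number of latches, and the special handling of the first pattern after a phase boundary (covered in Section 4) is ignored. The function name `pattern_schedule` and the label strings are hypothetical.

```python
from typing import List


def pattern_schedule(chains: List[str], latches_per_chain: int, active: int) -> List[str]:
    """Clock-by-clock schedule for one input pattern.

    chains: chain names in shifting order (the first entry shifts first).
    active: number of leading chains that still shift in the current phase;
            the remaining chains are disabled for shifting.
    """
    schedule: List[str] = []
    for name in chains[:active]:
        # Only one chain is clocked at a time while the others stay idle.
        schedule.extend([f"shift {name}"] * latches_per_chain)
    # A single capture clock is applied to every chain.
    schedule.append("capture (all chains)")
    return schedule


if __name__ == "__main__":
    for phase, active in enumerate((3, 2, 1), start=1):
        print(phase, pattern_schedule(["A", "B", "C"], 2, active))
```

For three chains of 2 latches each, a pattern takes 7 clocks in the first part, 5 in the second, and 3 in the last, which is where the saving in shifting clocks comes from.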

#### 3.2 MUXed Scan Architecture

Figure 3 depicts the proposed MUXed scan architecture. MUXes are placed in the middle of a scan chain, and the number of required MUXes is the number of steps in the clusters minus 1. The positions of the MUXes in the scan chain are determined by the number of columns in each column-wise divided test set. If each of the 3 column-wise divided test sets has 2 columns, as in Figure 1(b), 2 MUXes are placed right after the second and the fourth scan latch, respectively.
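The MUX placement rule can be expressed as a running sum over the chain segment sizes. The following sketch assumes that latches are numbered from 1 starting at the scan-in end and that `segment_sizes` lists the number of columns of each column-wise divided test set in that order; the function name `mux_positions` is hypothetical.

```python
from itertools import accumulate
from typing import List


def mux_positions(segment_sizes: List[int]) -> List[int]:
    """Latch indices (1-based) after which a MUX is inserted.

    One MUX sits at every internal segment boundary, so the result has
    len(segment_sizes) - 1 entries.
    """
    boundaries = list(accumulate(segment_sizes))
    return boundaries[:-1]  # the last boundary is the scan-out, which needs no MUX


if __name__ == "__main__":
    print(mux_positions([2, 2, 2]))
```

For three segments of 2 latches each this gives positions 2 and 4, matching the example in the text.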



Figure 3. MUXed scan architecture.

The clock scheme for this scan architecture is slightly different from that of the multiple partitioned scan architecture and is closer to the conventional clock scheme. For the first part of the test set in Figure 1(b), the clocks to every scan chain work for shifting, as in the conventional clock scheme. For the second part, the clocks to the first scan chain from the scan-in position (Scan Chain C) are disabled for shifting, while the clocks to the second and third scan chains (Scan Chain B and A) keep working. The first MUX in Figure 3 therefore takes input test data directly from the scan-in position of the scan chain, since the bits in the first scan chain (Scan Chain C) will be don't-care bits until the end of the scan operation. For the last part, only the last scan chain (Scan Chain A) is involved in the shifting process, using the second MUX to take input test data directly from the scan-in position. As in the multiple partitioned scan architecture, these scan chains that are disabled for shifting are the main contributors to the reduction in power dissipation and testing time. Note that when a long scan chain is partitioned into multiple scan chains, the partitioned scan chains should contain nearly the same number of scan latches, because this requires less control circuitry and is easier to manage.
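The per-phase clock gating and MUX selection described above can be summarized as a small lookup. This is a hedged sketch, not the controller of the paper: it assumes that `chains` lists the chain segments from scan-in to scan-out, that MUX `i` sits between segment `i` and segment `i + 1`, and that phases are numbered from 1. The function name `muxed_phase_config` and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def muxed_phase_config(chains: List[str], phase: int) -> Dict[str, object]:
    """Which segments are clocked and which MUX feeds from scan-in in a phase."""
    if not 1 <= phase <= len(chains):
        raise ValueError("phase must be between 1 and the number of chains")
    bypassed = phase - 1  # number of leading segments disabled for shifting
    # MUX i selects the scan-in input only if it feeds the first clocked segment.
    mux_takes_scan_in = [i == bypassed - 1 for i in range(len(chains) - 1)]
    return {"clocked": chains[bypassed:], "mux_takes_scan_in": mux_takes_scan_in}


if __name__ == "__main__":
    for phase in (1, 2, 3):
        print(phase, muxed_phase_config(["C", "B", "A"], phase))
```

With segments C, B, A in scan-in order, phase 1 clocks all three with both MUXes passing data from the preceding latch, phase 2 clocks B and A with the first MUX taking scan-in data, and phase 3 clocks only A with the second MUX taking scan-in data.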

#### 4. Interaction of Proposed Reordering and Two Reconfigured Scan Architectures



Figure 4. Test set after applying the proposed reordering (an example).

For the test set in Figure 4, which is a copy of the test set in Figure 1(b), the scan operations of the two proposed scan architectures are separated into 3 phases. All partitioned scan chains in the two proposed scan architectures have 2 scan latches. The phases of the two proposed scan operations correspond to the row-wise divided smaller test sets in Figure 4. Note that the rightmost column of the input test cube in Figure 4 is scanned in first, and that "Test Response $1 \sim 6$" are the test responses to "Input Pattern $1 \sim 6$", respectively.

#### 4.1 Scan Operation of Multiple Partitioned Scan Architecture

In Phase 1, the multiple partitioned scan architecture operates in the same way as in [5]. During the first 2 clocks, Scan Chain A shifts while Scan Chain B and C stay idle, with their buffers and clocks disabled. The next 2 clocks shift only Scan Chain B, and another 2 clocks shift only Scan Chain C. In this way, a total of 14 clocks are consumed in Phase 1. Attention must be paid to the point where Phase 1 changes to Phase 2, i.e., after "Test Response 2" has been captured.



At this point every partitioned scan chain holds "Test Response 2" in its scan latches. All bits of "Test Response 2" must be scanned out, since they are all needed for the analysis of the test response. Therefore the first input pattern of Phase 2 (Input Pattern 3) is scanned in the same way as the input patterns of Phase 1, although it belongs to Phase 2. Scanning in "Input Pattern 3" thus consumes 6 clocks, and 1 more clock is used to capture "Test Response 3".



Figure 6. Phase 3 (after the 1st clock of scan-in of "Input Pattern 6").

From "Test Response 3" onward, only the bits in Scan Chain A and B are useful for the analysis of the test response in Phase 2. Hence, Scan Chain C is permanently disabled for shifting. Once the don't-care bits of the clusters have been captured in the last scan chain as part of a test response, they are reapplied to the CUT as part of an input pattern and then recaptured as part of the next test response. This process continues until the end of the scan operation. Hence, it is sufficient to scan in only 4 bits of "Input Pattern 4", without the 2 don't-care bits in the clusters. Figure 5 shows the state of the scan chains right after the third clock of the scan-in of "Input Pattern 4" in Phase 2. In Phase 2, a capture clock occurs after every 4 shift clocks. After "Test Response 4" has been captured, attention must again be paid to the boundary between Phase 2 and Phase 3, for the same reason as between Phase 1 and Phase 2. The 4 bits of "Test Response 4" in Scan Chain A and B must be scanned out for the analysis of the test response, so "Input Pattern 5" of Phase 3 is scanned in the same way as the input patterns of Phase 2. Phase 2 takes 12 clocks in total.

In Phase 3, Scan Chain B and C are permanently disabled for shifting, since they hold only don't-care bits. A capture clock occurs after every 2 shift clocks, and a total of 10 clocks are spent. Figure 6 depicts the state of the scan chains right after the first clock of the scan-in of "Input Pattern 6". If the test set in Figure 4 goes through the conventional scan operation, it takes 48 clocks in total ($= (6 \times 6) + 6 + 6$). The proposed scan operation takes 36 clocks, so 12 clocks of testing time are saved.
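The clock counts of this example can be reproduced with a short calculation. The sketch below encodes the counting rules stated in this section: scan-in and scan-out overlap, each capture takes one clock, and the first pattern of a phase is shifted with the chain length of the previous phase so that the previous response can be scanned out completely. The function names and the `(num_patterns, active_bits)` encoding of a phase are assumptions made for illustration.

```python
from typing import List, Tuple


def conventional_clocks(num_patterns: int, chain_length: int) -> int:
    """Shift + capture for every pattern, plus the final scan-out."""
    return num_patterns * (chain_length + 1) + chain_length


def proposed_clocks(phases: List[Tuple[int, int]]) -> int:
    """phases: (num_patterns, active_bits) per phase, each with at least one pattern."""
    total = 0
    prev_bits = phases[0][1]
    for num_patterns, active_bits in phases:
        # First pattern of the phase: shift with the previous phase's length
        # so the previous test response is scanned out in full, then capture.
        total += prev_bits + 1
        # Remaining patterns of the phase: shift only the active bits, then capture.
        total += (num_patterns - 1) * (active_bits + 1)
        prev_bits = active_bits
    # Final scan-out of the last test response.
    return total + prev_bits


if __name__ == "__main__":
    print(conventional_clocks(6, 6))                  # 6 patterns, 6 scan latches
    print(proposed_clocks([(2, 6), (2, 4), (2, 2)]))  # 3 phases of 2 patterns each
```

For the test set in Figure 4, with 2 patterns per phase and 6, 4, and 2 active bits, the two functions give 48 and 36 clocks, and the per-phase contributions are 14, 12, and 10 clocks as stated above.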

#### 4.2 Scan Operation of MUXed Scan Architecture

The scan operation of the MUXed scan architecture is almost the same as that of the multiple partitioned scan architecture. In Phase 1, the clock operates in the conventional way, and the two MUXes take input test data from the scan latches ahead of them during shifting. Phase 1 therefore takes 14 clocks in total. Care must be taken at the point where Phase 1 changes to Phase 2, i.e., after "Test Response 2" has been captured.



Figure 7. Phase 2 (after the 3rd clock of scan-in of "Input Pattern 4").

At this point every partitioned scan chain holds "Test Response 2" in its scan latches. All bits of "Test Response 2" must be scanned out for the analysis of the test response. Therefore the first input pattern of Phase 2 (Input Pattern 3) is scanned in the same way as the input patterns of Phase 1, even though it belongs to Phase 2. In Phase 2, Scan Chain C is permanently disabled for shifting, since it holds don't-care bits until the end of the scan operation.



Figure 8. Phase 3 (after the 1st clock of scan-in of "Input Pattern 6").

Phase 2 uses only Scan Chain B and A for shifting and continues until "Test Response 4" has been captured. Figure 7 shows the state of the scan chains right after the third clock of the scan-in of "Input Pattern 4" in Phase 2, where the solid arrow indicates the flow of input test data. Care must again be taken at the boundary between Phase 2 and Phase 3, for the reason mentioned above. Phase 2 takes 12 clocks in total. In Phase 3, Scan Chain B and C are permanently disabled for shifting, and 10 clocks are spent in total. Figure 8 depicts the state of the scan chains right after the first clock of the scan-in of "Input Pattern 6" in Phase 3. Like the previous scan architecture, this scan architecture takes 36 clocks of testing time.

As for the number of transitions in the shifting process, a long scan chain architecture without any reordering of scan latches produces significantly more transitions for the test set in Figure 1(a) than the two proposed scan architectures produce for the test set in Figure 4, since the two proposed scan architectures reduce the switching transitions by using multiple scan chains. As mentioned above, the first input pattern of a phase must follow the same scan operation as the input patterns of the previous phase. The impact of this effect generally becomes less significant as test sets get larger. Therefore, the reduction in power dissipation and testing time on real circuits is greater than that obtained in this section. The size of the circuit required to control the proposed clock schemes is either almost the same as in [5] or slightly larger than that of the conventional scan architecture.

Some assumptions are made in implementing the proposed reordering of scan latches. First, most circuits use multiple clocks. Swapping two scan latches that belong to two different clock domains can create many problems, such as clock skew, so this kind of swap is not allowed in the proposed reordering scheme. Second, swapping scan latches between two different modules can cause long routing and routing congestion, so the reordering in this paper is assumed to take place inside a single module.

As mentioned before, the work in this paper does not consider the routing problem caused by reordering scan latches and does not use any physical placement and routing information [15, 16].

#### 5. Experimental Results

In the simulation for this paper, every ISCAS'89 benchmark circuit is assumed to be a single module with a single clock. The initial number of steps of the stair-shaped clusters is arbitrarily set to 5, and the number of steps is then decreased one step at a time, down to 2, whenever the clusters of don't-care bits cannot be created with the current number of steps. Thus the simulation starts by dividing a long scan chain into 5 smaller scan chains in the multiple partitioned scan architecture, and by inserting 4 MUXes between the partitioned scan chains in the MUXed scan architecture. This also means that one test set is divided into 5 smaller test sets, either row-wise or column-wise. Using fewer partitioned scan chains or fewer MUXes usually means a smaller reduction in power dissipation and testing time.

Table 1 shows the power dissipation results for various ISCAS'89 benchmark circuits with the two proposed scan architectures and two other schemes. The amount of dissipated power is estimated with the Weighted Transition Metric (WTM) in [14]. WTM is designed to approximate the power dissipation incurred by scan shifting of test data, and it is accurate enough to compare schemes in terms of power dissipation. Before WTM is applied, every don't-care bit in the test sets is converted to a specified bit by Minimum-Transition Fill (MT-fill) [14], which is designed to decrease the transitions inside the test sets. Table 2 gives the results in terms of testing time. Testing time is calculated on the basis that the scan-in of input test data and the scan-out of test response data take place at the same time, and that the scan-in of one bit, the scan-out of one bit, and the capture of a test response each take one clock.
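For readers unfamiliar with the metric, a minimal sketch of MT-fill and WTM for a single scan vector is given below. It uses the commonly cited form of the metric, in which a transition between adjacent bits $b_i$ and $b_{i+1}$ of a vector of length $l$ is weighted by $l - i$. Several details are assumptions made here rather than statements from the paper: the first character of the string is taken as $b_1$, each run of don't-care bits is filled with the nearest specified bit to its left (leading runs take the first specified bit, and an all-`X` vector becomes all zeros), and the function names are hypothetical.

```python
def mt_fill(vector: str) -> str:
    """Fill each 'X' so that no new transition is introduced (assumed fill direction)."""
    filled = []
    last = None
    for bit in vector:
        if bit != "X":
            last = bit
        filled.append(last)  # stays None for leading don't-care bits
    # Leading don't-care bits copy the first specified bit; all-'X' becomes zeros.
    first = next((b for b in filled if b is not None), "0")
    return "".join(first if b is None else b for b in filled)


def wtm(vector: str) -> int:
    """Weighted transitions of a fully specified vector of '0'/'1' characters."""
    length = len(vector)
    # i runs over 1..length-1 and compares b_i with b_{i+1} (1-based indices).
    return sum((length - i) * (vector[i - 1] != vector[i]) for i in range(1, length))


if __name__ == "__main__":
    cube = "X1XX0X"
    print(mt_fill(cube), wtm(mt_fill(cube)))
```

With these assumptions, `X1XX0X` is filled to `111100`, which contains a single transition with weight 2.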

In Table 1, the second column gives the number of transitions based on WTM for a long scan architecture. The third and fourth columns show the WTM results of the scheme in [5] and its improvements over a long scan architecture, which range from 44.69% to 85.92%. The fifth and sixth columns present the results of the multiple partitioned scan architecture and its improvements over a long scan architecture, which range from 48.12% to 86.65%. These results are numerically the best among all schemes compared, although they are almost the same as those of the scheme in [5]. However, when the testing time of the multiple partitioned scan architecture in Table 2 is considered as well, the multiple partitioned scan architecture is better than the scheme in [5] in terms of power dissipation and testing time combined.

Table 1: Number of transitions from WTM [14] in the ISCAS'89 benchmark test sets after applying the proposed reordering scheme

| Benchmark<br>Circuit | A Long Scan Chain Arch.:<br>Weighted Transition Metric | Scan Arch. in [5]:<br>Weighted Transition Metric | Scan Arch. in [5]:<br>Improvement on a Long Scan Arch. (%) | Proposed Multiple Partitioned Scan Arch.:<br>Weighted Transition Metric | Proposed Multiple Partitioned Scan Arch.:<br>Improvement on a Long Scan Arch. (%) | Proposed MUXed Scan Arch.:<br>Weighted Transition Metric | Proposed MUXed Scan Arch.:<br>Improvement on a Long Scan Arch. (%) |
|--------|------------|---------------|---------|---------------|---------|----------------|---------|
| s5378  | 617,588    | 182,682(4)†   | 70.42 % | 179,878(4)†   | 70.87 % | 382,306(4)†    | 38.10 % |
| s9234  | 781,447    | 432,215(4)    | 44.69 % | 405,419(4)    | 48.12 % | 623,493(4)     | 20.21 % |
| s13207 | 2,930,897  | 412,610(5)    | 85.92 % | 391,295(5)    | 86.65 % | 1,525,271(5)   | 47.96 % |
| s15850 | 2,543,894  | 628,459(5)    | 75.30 % | 625,838(5)    | 75.40 % | 1,670,975(5)   | 34.31 % |
| s38417 | 22,541,862 | 7,370,592(4)  | 67.30 % | 6,808,645(4)  | 69.80 % | 13,557,706(4)  | 39.86 % |
| s38584 | 19,642,919 | 3,609,079(5)  | 81.63 % | 3,492,483(5)  | 82.22 % | 11,142,870(5)  | 43.27 % |

Table 2: Testing time (unit: clock)

| Benchmark<br>Circuit | A Long Scan<br>Chain Arch. | The Proposed<br>Scan Arch. in [5] | Two Proposed<br>Scan Archs. | Improvement of Two<br>Proposed Scan Archs. | Benchmark<br>Circuit | A Long Scan<br>Chain Arch. | The Proposed<br>Scan Arch. in [5] | Two Proposed<br>Scan Archs. | Improvement of Two<br>Proposed Scan Archs. |
|--------|---------|-------------|-------------|---------|--------|---------|-------------|-------------|---------|
| s5378  | 25,799  | 25,799(4)†  | 19,871(4)†  | 22.98 % | s15850 | 74,051  | 74,051(5)†  | 50,761(5)†  | 31.45 % |
| s9234  | 36,703  | 36,703(4)   | 31,135(4)   | 15.17 % | s38417 | 159,839 | 159,839(4)  | 118,007(4)  | 26.17 % |
| s13207 | 168,239 | 168,239(5)  | 82,839(5)   | 50.76 % | s38584 | 193,379 | 193,379(5)  | 130,667(5)  | 32.43 % |
† The number in parentheses is the number of partitions of the test set (the number of steps of the clusters) that gave the best result.

For the MUXed scan architecture, the improvement based on WTM is lower, ranging from 20.21% to 47.96%. When its testing time is taken into account as well, however, it is highly competitive with the two reference schemes, i.e., a long scan architecture and the scheme in [5].

In Table 2, the second and seventh columns show the testing time of a long scan architecture, whereas the third and eighth columns show the testing time of the scheme in [5]. The testing time of these two schemes is exactly the same, since the scheme in [5] does not save any testing time even though it uses a multiple partitioned scan architecture. In contrast, the two proposed scan architectures both reduce the testing time, and by the same amount. The saving varies from 15.17% to 50.76%, depending on the benchmark circuit. The number in parentheses in Tables 1 and 2 is the number of steps in the clusters that gave the best result among the numbers of steps that were tried. This does not mean that this number always guarantees the best result for the simulated benchmark circuits; other numbers of steps might give a better result. Most of the improvement of the proposed schemes stems from removing the clusters of don't-care bits from the scan operation, using multiple scan chains with one scan-in and one scan-out, and disabling the clocks to some scan chains for some periods. The last factor is not reflected in the computed power dissipation results. If it were included, the improvement of the two proposed schemes in power dissipation would be considerably greater, even in comparison with the scheme in [5], since about 40% or more of the power dissipation is known to be consumed in the clock system. The proposed schemes in this paper have a clear advantage over a long scan chain architecture and the scheme in [5] with respect to power dissipation and testing time combined.
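The improvement percentages in Tables 1 and 2 are relative reductions with respect to the long scan chain architecture. As a quick consistency check, the short sketch below recomputes a few of them from the table entries; the function name `improvement_percent` is an assumption, and the numeric inputs are copied from the tables.

```python
def improvement_percent(baseline: int, proposed: int) -> float:
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return round(100.0 * (baseline - proposed) / baseline, 2)


if __name__ == "__main__":
    # WTM of s5378: long scan chain vs. multiple partitioned scan architecture.
    print(improvement_percent(617_588, 179_878))
    # Testing time of s13207: long scan chain vs. the two proposed architectures.
    print(improvement_percent(168_239, 82_839))
```

These two calls reproduce the 70.87 % entry of Table 1 and the 50.76 % entry of Table 2.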

To conclude, the proposed schemes in this paper show considerable savings in both power consumption and testing time. The overhead is small, because the extra circuitry needed to control the new scan architectures is not significant.

#### References

- [1] Y. Zorian, "A Distributed BIST Control Scheme for Complex VLSI Devices", Proc., IEEE VLSI Test Symposium, 1993, pp. 4-9
- [2] R. Gupta, M. A. Breuer, "Ordering Storage Elements in a Single Scan Chain", Proc., IEEE/ACM International Conference on CAD, 1991, pp. 408 – 411
- [3] S. Narayanan, R. Gupta, M. A. Breuer, "Optimal Configuring of Multiple Scan Chains", IEEE Transactions on Computers, vol. 42, no. 9, Sept. 1993, pp. 1121 – 1131

- [4] D. Ghosh, S. Bhunia, K Roy, "Multiple Scan Chain Design Technique for Power Reduction during Test Application in BIST", Proc., IEEE International Symposium on DFT, 2003, pp. 191 – 198
- [5] L. Whetsel, "Adapting Scan Architectures for Low Power Operation", Proc., IEEE International Test Conference, 2000, pp. 863 – 872
- [6] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, H.J. Wunderlich, "A Modified Clock Scheme for a Low Power BIST Test Pattern Generator", Proc., IEEE VLSI Test Symposium, 2001, pp. 302 – 311
- [7] S. Samaranayake, N. Sitchinava, R. Kapur, M.B. Amin, T.W. Williams, "Dynamic Scan: Driving down the Cost of Test", IEEE Computer, vol. 35, no. 10, Oct. 2002, pp. 63 – 68
- [8] I. Hamzaoglu and J.H. Patel, "*Reducing Test Application Time for Full Scan Embedded Cores*", Proc., IEEE International Symposium on Fault Tolerant Computing, 1999, pp. 260 – 267
- [9] J. Aerts and E. J. Marinissen, "Scan Chain Design for Test Time Reduction in Core-Based ICs", Proc., International Test Conference, 1998, pp. 448 – 457
- [10] R. Sankaralingam, B. Pouya, and N.A. Touba, "Reducing Power Dissipation During Testing Using Scan Chain Disable", Proc., IEEE VLSI Test Symposium, 2001, pp. 319 – 324
- [11] O. Sinanoglu and A. Orailoglu, "A Novel Scan Architecture for Power-Efficient Rapid Test", Proc., IEEE/ACM International Conference on CAD, 2002, pp. 299 – 303
- [12] I. Lee, Y. M. Hur, T. Ambler, "The Efficient Multiple Scan Chain Architecture Reducing Power Dissipation and Test Time", Proc., IEEE Asian Test Symposium, 2004, pp. 94 – 97
- [13] M. R. Garey and D. S. Johnson, "Computers and Intractability: A Guide to the Theory of NP-Completeness", Freeman, 1979
- [14] S. Ghosh, S. Basu and N. Touba. "Joint Minimization of Power and Area in Scan Testing by Scan Cell Reordering", Proc., IEEE Symposium on VLSI, 20-21 Feb. 2003, pp. 246 – 249
- [15] S. Makar, "A Layout-Based Approach for Ordering Scan Chain Flip-Flops", Proc., IEEE International Test Conference, 1998, pp. 341 – 347
- [16] Y. Bonhomme, P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, "*Efficient Scan Chain Design for Power Minimization during Scan Testing under Routing Constraint*", Proc., IEEE International Test Conference, 2003, pp. 488 – 493.