# Low-Overhead Design of Soft-Error-Tolerant Scan Flip-Flops with Enhanced-Scan Capability\*

Ashish Goel, Swarup Bhunia<sup>1</sup>, Hamid Mahmoodi<sup>2</sup>, and Kaushik Roy

Dept of ECE, Purdue University, West Lafayette, IN-47907, USA, email: <ashishg, kaushik>@ecn.purdue.edu <sup>1</sup>Dept of EECS, Case Western Reserve University, Cleveland, OH-44106, USA, email: Swarup.Bhunia@case.edu <sup>2</sup>School of Engineering, San Francisco State University, San Francisco, CA-94132, USA, email: mahmoodi@sfsu.edu

*Abstract* – With technology scaling, soft error resilience is becoming a major concern in circuit design. This paper presents a class of low-overhead flip-flops suitable for soft error detection and correction. The proposed design reuses logic elements typically available in a standard-cell implementation of a flip-flop to reduce hardware overhead. We demonstrate that the proposed flip-flops are also suitable for enhanced scan based delay fault testing, which allows arbitrary two-pattern test application for the best combinational path testability. The proposed flip-flops show an average power reduction of 16% and area improvement of 17% compared to the best alternative techniques with no additional delay overhead.

# I. INTRODUCTION

Device dimensions are scaled down aggressively every technology generation. With scaling, the area per bit scales down and is about  $1\mu m^2$  in a 90nm technology SRAM cell [1]. This results in reduced average node capacitance. To maintain the electrostatics of the device and prevent breakdown caused by high electric fields, the supply voltage has also been scaled down with transistor dimensions. The net effect of reduction in node capacitance and the supply voltage is that the amount of charge stored at a particular node is going down every generation. This results in increased susceptibility of latches and flip-flops to radiation induced soft errors [1].

Soft errors are caused by alpha particles or neutrons emitted by packaging materials and by cosmic rays from deep space [1]. The circuit, however, is not permanently damaged by this radiation. The energetic particles create minority carriers that are collected by parasitic source and drain diodes. This collected charge causes a transient voltage pulse that is often large enough to alter the logic state of a node. If the voltage fluctuation is smaller than the noise margin, the circuit will continue to perform properly. With the reduction in node capacitance, the *same-energy particle* can create a larger voltage fluctuation at a node. Hence, the susceptibility of a circuit to soft errors increases every technology generation.
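The charge argument above can be illustrated with a first-order estimate, $\Delta V = Q / C$. The sketch below is only an illustration: the collected charge, node capacitances, and noise margin are assumed values chosen for the example, not measurements from this work, and the function names are hypothetical.

```python
def voltage_disturbance(collected_charge_fc: float, node_cap_ff: float) -> float:
    """First-order transient voltage (V) when charge Q (fC) lands on capacitance C (fF)."""
    return collected_charge_fc / node_cap_ff  # fC / fF = V


def upsets_node(collected_charge_fc: float, node_cap_ff: float,
                noise_margin_v: float) -> bool:
    """True if the disturbance exceeds the noise margin and may flip the node."""
    return voltage_disturbance(collected_charge_fc, node_cap_ff) > noise_margin_v


# Assumed, illustrative numbers: the same 1 fC strike on progressively smaller nodes.
ASSUMED_CHARGE_FC = 1.0
ASSUMED_NOISE_MARGIN_V = 0.4
for cap_ff in (5.0, 2.0, 1.0):
    dv = voltage_disturbance(ASSUMED_CHARGE_FC, cap_ff)
    flipped = upsets_node(ASSUMED_CHARGE_FC, cap_ff, ASSUMED_NOISE_MARGIN_V)
    print(f"C = {cap_ff} fF: dV = {dv:.2f} V, upset = {flipped}")
```

Under these assumptions the same strike is harmless on the largest node and upsets the smaller ones, which is the trend the paragraph describes.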

Soft errors affect memory, sequential elements, and logic. Conventionally, ECC (Error Correcting Code) is used to detect and correct any soft errors in SRAM memories [2]. However, latches and flip-flops are more difficult to protect using parity and error correcting codes [3] [4]. Several hardening techniques have been proposed to increase the soft error resilience of flip-flops and latches [3] [5]. These techniques rely on increasing the node capacitance and using redundancy to improve the tolerance to soft errors. However, these techniques come at the cost of large delay and area overhead.

In today's designs, all flip-flops and latches are expected to be in a scan path so that efficient testing (allowing easy access to the internal nodes of a logic block) is possible. Hence, it is important to consider integration of any soft error tolerant flip-flop or latch into the scan path. Enhanced scan is a delay fault test method that allows easy application of state transitions and enables deterministic choice of any launching pattern in the scan flip-flops for the best possible combinational path testability [6] [7]. A recently proposed scan design [8], referred to as HSSG (Hold Scan using Scan Gadget), uses a *scan gadget element* along with the system latch to implement enhanced scan based delay testing.

As mentioned above, it is becoming necessary to have soft error resilience as well as delay fault testing ability in a system flip-flop. In [9], the authors proposed a technique that uses the existing on-chip scan design-for-testability resources for soft error protection during normal operation (referred to as the ISR technique in this paper).

In this paper, we propose two novel flip-flop designs that allow:

- Soft error detection/correction along with enhanced scan delay testing with lower area and power overhead compared to the existing approach (ISR).
- Arbitrary two-pattern test application (i.e. enhanced scan based delay testing) with lower design overhead compared to HSSG.

A typical standard-cell flip-flop has buffers to drive the output stage [12]. These drivers are sized up to drive a high number of fanouts. We make use of these drivers and convert them into a latch. This modification helps us in reducing the number of latches from four (as required in HSSG and ISR) to three for implementing the scan latch and system flip-flop. The proposed flip-flops are suitable for use in muxed-scan designs. Experiments performed on ISCAS89 benchmarks show better results over the competing techniques in terms of area and power overhead.

<sup>\*</sup> This work is sponsored in part by Marco Gigascale Systems Research Center (GSRC) and Semiconductor Research Corp.

The rest of the paper is organized as follows: Section II shows the proposed flip-flop design for soft error detection and correction. Section III illustrates the design to be used for two-pattern delay testing. Section IV presents the simulation results in terms of area, delay and power for a set of benchmark circuits. Section V concludes the paper.

# II. SOFT ERROR DETECTION/CORRECTION

A soft error can change the state of a particular node in a circuit. If a soft error occurs at a node of a latch, it can lead to a wrong value being stored in the latch. In a master-slave flip-flop, a soft error can occur either in the master stage or in the slave stage. If it occurs in either of them, it will corrupt the data stored in the flip-flop. Both master and slave stages are susceptible to soft errors during the data storing state of the clock. The master stage stores the data when the clock is '1' and the slave stage stores the data when the clock is '0'.

One of the basic methodologies for soft error detection/correction is to have a redundant copy of the stored data. The redundant copy is used to determine whether a soft error has occurred or not. This is based on the assumption that only one of the copies of data is affected by a soft error. Self-correcting flip-flops have been designed using hardening techniques [3]. These schemes use data redundancy and feedback topology to correct the stored value in the event of a soft error. However, the hardware used for redundancy is not used for any other function and adds to the area, power and delay overhead.

Delay test schemes like HSSG have redundant scan resources that are unused during normal mode of operation. These scan resources add to the area of the chip as well as to the leakage power in normal mode of operation. In [9], the authors proposed a method in which these scan resources are used as a shadow of the system flip-flop. However, the design uses four latches (two for the system flip-flop and two for the scan flip-flop) to store the redundant copy of the data. Therefore, it has considerable area overhead.

### II. a. Soft Error Detection

Pipelines having flush capability can work fine with soft error detection only. Once a soft error is detected in a particular stage of the pipeline, the pipeline can be stalled and restarted from a particular checkpoint. In such a design, soft error correction is not required. Fig. 1 shows the proposed flip-flop design having enhanced scan and soft error detection capability. To reduce the overhead involved, we convert the inverters used to drive the outputs ($I_1$ and $I_2$) into a latch. This is done to store a copy of the data stored in the flip-flop. We also combine the master stages of the scan flip-flop and the system flip-flop into one master stage. Using this strategy we need just three latches to store the two copies of the data.

Fig. 1. Flip Flop design with Enhanced Scan Capability and Soft Error Detection (ESFF-SED)

The two drivers $I_1$ and $I_2$ are converted into a latch using transmission gates T<sub>2</sub>, T<sub>3</sub>, T<sub>4</sub> and T<sub>5</sub>. The multiplexer at the input of the flip-flop is controlled by the Test Control (TC) signal. The multiplexer selects the input SI (which is connected to output SO of the previous scan flip-flop) in the test mode (TC = '0') to enable the loading of the scan-chain. During normal mode of operation (TC = '1'), the multiplexer selects the input D coming from the previous stage. The transmission gate ($T_6$) is added in the circuit to separate latch $L_2$ from $L_3$. The transmission gates $T_3$ and $T_4$ are controlled by a signal HOLD which is generated using the Test Control signal (TC) and the clock. During test mode (TC = '0'), HOLD becomes '0', turning the transmission gate T<sub>3</sub> OFF and T<sub>4</sub> ON, thereby closing latch L<sub>3</sub>. As T<sub>3</sub> is OFF in test mode, L<sub>3</sub> gets disconnected from L<sub>1</sub> and L<sub>2</sub>. Latch L<sub>3</sub> thus acts as a hold latch during test mode.

The first test vector is scanned in serially in all the flip-flops. When scan-shifting is completed for the first pattern, it is applied to the combinational circuit by making TC = '1'. After the combinational circuit stabilizes, the second pattern is scanned in and the first pattern is held in the hold latch L<sub>3</sub>. Next, the transition is launched by making TC = '1' and the results are latched after one rated clock period.

During normal mode of operation (TC = '1'), HOLD is a delayed and inverted version of the clock signal. An XOR gate is used to detect the soft error by comparing the voltages at nodes Q and SO. To keep the area overhead minimum, we remove the inverter in a normal XOR gate design and use $\overline{Q}$, which is already available in the latch L<sub>3</sub>.
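The control and detection logic described above can be summarized in a small behavioural sketch. It assumes that HOLD is formed as TC AND (NOT clock), which is consistent with the text (HOLD is '0' in test mode and an inverted clock in normal mode) but is not spelled out at gate level here. The gate delay is ignored, and the function names are hypothetical.

```python
def hold_signal(tc: int, clk: int) -> int:
    """Ideal (zero-delay) HOLD: 0 in test mode, inverted clock in normal mode.

    Assumption: HOLD = TC AND (NOT clk); the real signal is also delayed.
    """
    return tc & (1 - clk)


def soft_error_flag(q: int, so: int) -> int:
    """XOR comparison of the two stored copies; 1 means the copies disagree."""
    return q ^ so


# Truth table of the assumed HOLD logic.
for tc in (0, 1):
    for clk in (0, 1):
        print(f"TC={tc} clk={clk} -> HOLD={hold_signal(tc, clk)}")
```

A flag of 1 indicates that exactly one of the two copies has changed, which matches the single-upset assumption made earlier in this section.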

We need to detect the occurrence of a soft error in both the master and the slave stage. To do this, we need to sample the data from the master stage and store it in latch $L_3$. This sampling has to be done at the rising edge of the clock, and we need to disconnect $L_3$ from $L_1$ and $L_2$ for the rest of the clock period. This prevents a soft error in any of these latches from affecting the data stored in L<sub>3</sub>. This functionality can be achieved by generating a pulse at the rising clock edge. This pulse can control the transmission gate at the input of latch L<sub>3</sub>. However, pulse generation is usually not desired in a circuit because of the large power overhead involved. To achieve the required functionality we use the concept of implicit pulsing [10]. An implicit pulse can be generated using two transmission gates connected in series. The controlling signals of the gates are designed in such a way that both gates are ON together only for a short period of time t<sub>d</sub>. Thus, there is a direct path from the input to the output only during the time interval t<sub>d</sub>. Using this method we can achieve the functionality of a pulse of duration t<sub>d</sub> without generating the pulse signal explicitly.
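A discrete-time sketch can make the implicit-pulse idea concrete. It assumes one gate conducts while the clock is '1' and the other conducts while a delayed, inverted copy of the clock is '1', so the series path is closed only for the delay t<sub>d</sub> after each rising edge. The time step, period, and delay values are illustrative assumptions, and the function name is hypothetical.

```python
def implicit_pulse_window(period: int, delay: int, cycles: int = 2) -> list:
    """Return the time steps at which both series transmission gates conduct.

    Assumptions: the clock is low for the first half of each period and high for
    the second half; the second control signal is the inverted clock delayed by
    `delay` steps, with delay < period // 2.
    """
    total = period * cycles
    clk = [1 if (t % period) >= period // 2 else 0 for t in range(total)]
    # Delayed, inverted clock; before `delay` steps have elapsed, reuse clk[0].
    delayed_inverted = [1 - clk[max(t - delay, 0)] for t in range(total)]
    return [t for t in range(total) if clk[t] == 1 and delayed_inverted[t] == 1]


window = implicit_pulse_window(period=10, delay=2, cycles=2)
print(window)
```

With a period of 10 steps and a delay of 2 steps, the path is closed for exactly 2 steps after each rising edge, without any explicit pulse signal being generated.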

In our circuit, the implicit pulse is generated by the combination of transmission gates $T_2$ and $T_3$. $T_2$ turns ON at the rising edge of the clock and starts propagating the data. As HOLD is just a delayed and inverted version of the clock, $T_3$ is ON at the rising edge of the clock and it turns OFF after a small time delay. This time delay is determined by the delay of the AND gate used in generating the HOLD signal. During this time period after the rising edge of the clock, both $T_2$ and $T_3$ are ON and there exists a path between $L_1$ and $L_3$. Transmission gates T<sub>4</sub> and T<sub>5</sub> are OFF during this period of time, thus opening latch L<sub>3</sub>. At the rising edge of the clock the value stored in the master stage is sampled by L<sub>3</sub>. After T<sub>3</sub> turns OFF, L<sub>3</sub> is disconnected from $L_1$. During the rest of the clock cycle, either $T_3$ or $T_2$ is OFF while T<sub>4</sub> or T<sub>5</sub> is ON, keeping the cross-coupled inverter loop ($I_1$ and $I_2$) closed. When clk = '0', $T_5$ is ON and $T_4$ is OFF, and vice versa when clk = '1'. We compare our design to the ISR design with the C-element removed and an XOR gate added for detecting soft errors (referred to as ISR-WC).

Fig. 2. Flip Flop design with Enhanced Scan Capability and Soft Error Correction (ESFF-SEC)

### II. b. Soft Error Correction

The method described in [9] (ISR) inserts a C-element after the flip-flops. The C-element compares the output of the shadow latch and the system flip-flop. If the values are the same, it simply functions as an inverter. If the two values are different, it blocks the propagation of the wrong value. A keeper is used after the C-element to store the value. This design has very good soft error tolerance as it does not allow the wrong value to propagate. However, the shadow latch has to be upsized to match the timing of the system flip-flop; otherwise the delay of the flip-flop will be set by the slower of the two. This leads to a large area and power overhead.
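The C-element behaviour described above can be captured in a few lines. This is a behavioural sketch only: it models the C-element and its keeper as a single object with one stored output bit, and the class and method names are hypothetical.

```python
class CElement:
    """Behavioural C-element with keeper: inverts when inputs agree, else holds."""

    def __init__(self, initial_output: int = 0):
        self.output = initial_output  # value retained by the keeper

    def evaluate(self, a: int, b: int) -> int:
        if a == b:
            # Both copies agree: the element acts as an inverter.
            self.output = 1 - a
        # Copies disagree: the output is not driven, so the keeper holds
        # the previous value and the wrong value does not propagate.
        return self.output


c = CElement()
print(c.evaluate(0, 0))  # copies agree
print(c.evaluate(1, 0))  # one copy upset: previous output is held
```

In this model a single upset in either copy leaves the output unchanged, which is why the scheme corrects rather than merely detects the error.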

We use the C-element and include it in our design as shown in Fig. 2. Inverters $I_1$ and $I_2$ no longer act as drivers because they no longer drive the output load. Thus they are sized down to reduce the overhead. Table I shows the comparison between the proposed (ESFF-SEC) design and the ISR design. The proposed flip-flop has a 20% improvement in power for the same C-to-Q delay. This is due to the reduction in the number of latches from four to three. The setup time also shows an improvement of 40%. This is due to the upsizing of the shadow latch in the ISR flip-flop. Upsizing increases the load on the data driver, which results in an increased setup time.
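The percentages quoted above can be related to the normalized values in Table I with a short calculation. The sketch assumes that improvement is measured relative to the ISR value, i.e. (ISR − proposed) / ISR; the function name is hypothetical.

```python
def percent_improvement(reference: float, proposed: float) -> float:
    """Improvement of `proposed` over `reference`, as a percentage of `reference`."""
    return (reference - proposed) / reference * 100.0


# Normalized ISR values from Table I; ESFF-SEC is 1 in every row.
TABLE_I_ISR = {"Power": 1.26, "C-to-Q delay": 1.0, "Setup time": 1.7, "Area": 1.61}

for metric, isr_value in TABLE_I_ISR.items():
    print(f"{metric}: {percent_improvement(isr_value, 1.0):.1f}% improvement")
```

Under this definition the power and setup-time figures come out close to the rounded 20% and 40% quoted in the text.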

As described in the last sub-section, latch $L_3$ is open to the master stage during a small period of time after the rising edge of the clock. During this time, if a soft error occurs in the master stage, then the wrong value is written into $L_3$. In such a case, both copies of the data ($L_1$ and $L_3$) are wrong and the flip-flop is not able to detect the soft error. However, this time period $t_d$ is very small compared to the clock period. Thus the overall soft error resilience does not change much and is comparable to the ISR design.

Table I. Normalized Power, Delay and Area comparison for flip-flops used for soft error correction

|              | ESFF-SEC | ISR FF |
|--------------|----------|--------|
| Power        | 1        | 1.26   |
| C-to-Q delay | 1        | 1      |
| Setup time   | 1        | 1.7    |
| Area         | 1        | 1.61   |

Fig. 3. Scan Gadget Scheme (HSSG)

# III. ENHANCED SCAN APPROACH TO DELAY TESTING

The previous flip-flop designs provide delay fault testing capability along with soft error resilience in terms of detection and correction. However, adding soft error resilience to all the flip-flops in the system to achieve 100 percent soft error protection is not really needed for most ground-level applications [9]. If soft error resilience is not a requirement, then the designs proposed in Fig. 1 and Fig. 2 can be simplified to include only delay fault testing functionality.

Enhanced scan based delay fault testing requires application of a transition at the state inputs of a combinational block by holding its output state in response to the initial pattern before applying the second pattern. HSSG uses a scan gadget along with the system latch as shown in Fig. 3 [8]. The scan chain is implemented by the scan gadget element, which provides the basic test functions (shift, load and capture). This design has an overhead of using two extra latches, adding to the leakage power and area overhead. Moreover, in this scheme the system flip-flop has a more complicated design, with two clocks and two data inputs, one for system operation and the other for loading to/from the scan chain. The extra circuitry adds to the flip-flop power in the normal mode of operation.

Fig. 4. Flip Flop design with Enhanced Scan Capability (ESFF)

Fig. 4 shows the proposed master-slave flip-flop design with enhanced scan capability. The two inverters I<sub>1</sub> and I<sub>2</sub> are converted into a latch using transmission gates T<sub>3</sub> and T<sub>4</sub>. Latch L<sub>3</sub> acts as a hold latch during test mode and drives the combinational circuit. Transmission gates T<sub>3</sub> and T<sub>4</sub> together act as the controlling circuit that enables operation of the circuit during test mode and normal mode. The operation of the flip-flop is as follows. In normal mode of operation (TC = '1'), transmission gate T<sub>4</sub> is OFF and T<sub>3</sub> is ON. The inverters I<sub>1</sub> and I<sub>2</sub> act as drivers and propagate the value stored in the slave latch $L_2$. During the test mode of operation (TC = '0'), T<sub>3</sub> is OFF and T<sub>4</sub> is ON. This converts the two inverters ($I_1$ and $I_2$) into a cross-coupled loop to form the hold latch L<sub>3</sub>. As T<sub>3</sub> is open, L<sub>3</sub> gets disconnected from latches $L_1$ and $L_2$.
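The two operating modes and the resulting two-pattern test sequence can be sketched with a single-bit behavioural model. The model abstracts away the master stage, clock phases, and signal polarity: it only tracks the slave value and the value presented to the combinational logic. The class and method names are hypothetical.

```python
class EnhancedScanFF:
    """Single-bit behavioural model of the hold behaviour controlled by TC."""

    def __init__(self):
        self.l2 = 0   # value in the slave latch
        self.l3 = 0   # value driven into the combinational logic
        self.tc = 1   # 1 = normal mode, 0 = test mode

    def set_tc(self, tc: int) -> None:
        self.tc = tc
        if tc == 1:
            # T3 ON, T4 OFF: the output stage passes the slave value through.
            self.l3 = self.l2

    def clock(self, value: int) -> int:
        """Capture `value` into the flip-flop and return the driven output."""
        self.l2 = value
        if self.tc == 1:
            self.l3 = self.l2
        # tc == 0: T3 OFF, T4 ON, so the hold latch keeps its previous value.
        return self.l3


ff = EnhancedScanFF()
ff.set_tc(0); ff.clock(1)   # scan in first pattern bit; logic input is held
ff.set_tc(1)                # apply first pattern
ff.set_tc(0); ff.clock(0)   # scan in second pattern bit; first is still held
held = ff.l3
ff.set_tc(1)                # launch the transition
print(held, ff.l3)
```

The point of the sketch is that scan shifting with TC = '0' never disturbs the value seen by the logic, so any pair of patterns can be applied back to back.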

The ESFF design does not require any extra timing control signals as required in a conventional enhanced scan test method. It only uses the test control signal (TC) and its complement ( $\overline{TC}$ ), which are used in conventional scan-based testing. The proposed design does not involve the overhead due to extra latches and the complex system flip-flop used in HSSG. It is worth noting that the proposed design also maintains the power-saving advantage of enhanced scan in the test mode, since it prevents redundant switching in the combinational block by isolating it from the activity in scan register.

# IV. SIMULATION RESULTS AND COMPARISON

To estimate the effectiveness of the proposed designs, we simulated a set of ISCAS89 benchmark circuits and obtained area, power, and performance overhead for the proposed designs and the competing techniques. The simulations were performed using the 70nm BPTM models [11] to observe the effects in a sub-100nm scaled technology. The gate-level netlists were first technology-mapped to the *LEDA* $0.25\mu m$ standard cell library using Synopsys Design Compiler with the mapping effort set to medium. The library contains complex gate types such as "aoi" (and-or-invert) and "mux", and hence, the total number of logic gates is reduced from that in the original benchmark. The benchmark circuits are then translated to *Hspice* netlists and scaled to 70nm.

|                |                  | Power  |              |                      | Delay  |              |                      | Area   |              |                      |
|----------------|------------------|--------|--------------|----------------------|--------|--------------|----------------------|--------|--------------|----------------------|
| ISCAS89<br>Ckt | # Flip-<br>Flops | ISR-WC | ESFF-<br>SED | %imp. over<br>ISR-WC | ISR-WC | ESFF-<br>SED | %imp. over<br>ISR-WC | ISR-WC | ESFF-<br>SED | %imp. over<br>ISR-WC |
| s298           | 14               | 0.837  | 0.655        | 21.74                | 23.648 | 23.600       | 0.20                 | 0.878  | 0.723        | 17.69                |
| s344           | 15               | 0.914  | 0.719        | 21.32                | 27.924 | 27.88        | 0.169                | 0.9541 | 0.787        | 17.45                |
| s641           | 19               | 1.054  | 0.808        | 23.41                | 51.139 | 51.091       | 0.092                | 1.247  | 1.036        | 16.91                |
| s838           | 32               | 1.773  | 1.358        | 23.44                | 52.813 | 52.766       | 0.089                | 2.087  | 1.732        | 17.01                |
| s1196          | 18               | 1.563  | 1.3297       | 14.96                | 40.077 | 40.030       | 0.118                | 1.859  | 1.659        | 10.74                |
| s1423          | 74               | 4.604  | 3.642        | 20.88                | 100.00 | 99.953       | 0.047                | 4.648  | 3.827        | 17.67                |
| s5378          | 179              | 10.514 | 8.188        | 22.12                | 32.312 | 32.265       | 0.146                | 10.913 | 8.928        | 18.20                |
| s9234          | 211              | 12.304 | 9.562        | 22.28                | 45.044 | 44.997       | 0.105                | 13.315 | 10.975       | 17.58                |
| s13207         | 638              | 36.182 | 27.891       | 22.91                | 56.259 | 56.212       | 0.084                | 26.522 | 19.443       | 26.69                |
| s15850         | 534              | 30.740 | 23.80        | 22.57                | 67.122 | 67.075       | 0.070                | 25.174 | 19.249       | 23.53                |
| s35932         | 1728             | 100.00 | 77.55        | 22.45                | 22.809 | 22.762       | 0.207                | 100.00 | 80.827       | 19.17                |

Table II. Comparison of Power, Delay and Area for ISR-WC and ESFF-SED (Normalized to scale of 100)

We assumed full-scan implementation of the benchmarks. Power is measured in *NanoSim* by applying 100 random vectors to the inputs, and delay is measured by *Hspice* simulation of the critical path of a circuit. Since the layout rules for the 70nm node are not available, the measure used for area is the total transistor active area ($W \times L$ for a transistor).
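The transistor-active-area measure described above is simply the sum of $W \times L$ over all transistors in a netlist. A minimal sketch follows; the transistor sizes in the example are hypothetical and are not taken from the benchmark netlists.

```python
def total_active_area(transistors) -> float:
    """Sum W x L over an iterable of (width, length) pairs, in consistent units."""
    return sum(width * length for width, length in transistors)


# Hypothetical example: two transistors with sizes given in micrometres.
example_netlist = [(0.28, 0.07), (0.14, 0.07)]
print(total_active_area(example_netlist))  # area in square micrometres
```

Because the measure ignores wiring and spacing, it is best read as a relative indicator for comparing designs rather than as an absolute layout area.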

Table II shows a comparison of power, delay and area for our proposed soft error detection design (ESFF-SED) (Fig. 1) with ISR-WC. All the power, delay and area values have been normalized with respect to the maximum value among all the benchmarks (where the maximum value is assigned 100). In the ESFF-SED design, on average we observe a 21% improvement in power and 17% improvement in area over ISR-WC with no additional delay overhead. It is interesting to note, however, that the improvement in area over ISR-WC drops from 26.69% in benchmark s13207 to 10.74% in benchmark s1196. This is because sequential elements contribute just 22% of the total area in benchmark s1196, whereas they contribute about 70% of the total area in s13207. Similarly, the power improvement drops to 14.96% in benchmark s1196 from 22.91% in benchmark s13207. The delay is comparable in both designs because both have the same number of gates in the C-to-Q delay path.
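The dependence of the circuit-level saving on the sequential share can be illustrated with a first-order model in which only the flip-flop area shrinks and the combinational area is unchanged. The per-flip-flop saving used below is an assumed, illustrative value, so the outputs show the trend only and are not expected to reproduce the Table II entries. The function name is hypothetical.

```python
def circuit_level_saving(seq_fraction: float, ff_saving: float) -> float:
    """Percent saving over the whole circuit when only the sequential share shrinks.

    seq_fraction: share of total area taken by sequential elements (0..1).
    ff_saving:    fractional saving per flip-flop (0..1), an assumed value here.
    """
    return seq_fraction * ff_saving * 100.0


ASSUMED_FF_SAVING = 0.40  # illustrative assumption, not a measured figure
for name, seq_fraction in (("s1196", 0.22), ("s13207", 0.70)):
    saving = circuit_level_saving(seq_fraction, ASSUMED_FF_SAVING)
    print(f"{name}: about {saving:.1f}% circuit-level area saving")
```

The model shows why a benchmark dominated by combinational logic, such as s1196, benefits much less from a cheaper flip-flop than one dominated by sequential elements.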

|                |                  | Power  |              |                   | Delay  |              |                   | Area   |              |                   |
|----------------|------------------|--------|--------------|-------------------|--------|--------------|-------------------|--------|--------------|-------------------|
| ISCAS89<br>Ckt | # Flip-<br>Flops | ISR    | ESFF-<br>SEC | %imp.<br>over ISR | ISR    | ESFF-<br>SEC | %imp.<br>over ISR | ISR    | ESFF-<br>SEC | %imp.<br>over ISR |
| s298           | 14               | 0.836  | 0.701        | 16.21             | 25.625 | 25.617       | 0.031             | 0.869  | 0.704        | 18.99             |
| s344           | 15               | 0.913  | 0.768        | 15.90             | 29.790 | 29.782       | 0.026             | 0.943  | 0.766        | 18.76             |
| s641           | 19               | 1.055  | 0.871        | 17.43             | 52.404 | 52.396       | 0.015             | 1.228  | 1.004        | 18.25             |
| s838           | 32               | 1.775  | 1.465        | 17.45             | 54.034 | 54.027       | 0.015             | 2.058  | 1.680        | 18.34             |
| s1196          | 18               | 1.553  | 1.378        | 11.23             | 41.629 | 41.621       | 0.019             | 1.759  | 1.547        | 12.07             |
| s1423          | 74               | 4.597  | 3.880        | 15.59             | 100.00 | 99.993       | 0.007             | 4.603  | 3.731        | 18.97             |
| s5378          | 179              | 10.511 | 8.778        | 16.49             | 34.065 | 34.057       | 0.023             | 10.846 | 8.734        | 19.47             |
| s9234          | 211              | 12.302 | 10.259       | 16.73             | 46.468 | 46.460       | 0.032             | 13.181 | 10.691       | 18.89             |
| s13207         | 638              | 36.197 | 30.019       | 16.61             | 57.391 | 57.384       | 0.017             | 27.789 | 20.261       | 27.09             |
| s15850         | 534              | 30.743 | 25.572       | 16.82             | 67.974 | 67.966       | 0.012             | 25.872 | 19.571       | 24.35             |
| s35932         | 1728             | 100.00 | 83.266       | 16.73             | 24.808 | 24.800       | 0.014             | 100.00 | 79.610       | 20.39             |

 Table III. Comparison of Power, Delay and Area for ISR and ESFF-SEC (Normalized to scale of 100)

|                |                  | Power  |        |                       | Delay  |        |                       | Area   |        |                       |
|----------------|------------------|--------|--------|-----------------------|--------|--------|-----------------------|--------|--------|-----------------------|
| ISCAS89<br>Ckt | # Flip-<br>Flops | HSSG   | ESFF   | %imp.<br>over<br>HSSG | HSSG   | ESFF   | %imp.<br>over<br>HSSG | HSSG   | ESFF   | %imp.<br>over<br>HSSG |
| s298           | 14               | 0.845  | 0.638  | 24.4                  | 23.128 | 23.093 | 0.15                  | 0.831  | 0.719  | 13.5                  |
| s344           | 15               | 0.928  | 0.707  | 23.8                  | 27.433 | 27.398 | 0.13                  | 0.904  | 0.784  | 13.3                  |
| s641           | 19               | 1.040  | 0.759  | 26.9                  | 50.806 | 50.771 | 0.07                  | 1.184  | 1.032  | 12.9                  |
| s838           | 32               | 1.748  | 1.276  | 27.0                  | 52.491 | 52.457 | 0.07                  | 1.981  | 1.724  | 13.0                  |
| s1196          | 18               | 1.728  | 1.462  | 15.4                  | 39.669 | 39.634 | 0.09                  | 1.816  | 1.671  | 8.0                   |
| s1423          | 74               | 4.705  | 3.612  | 23.2                  | 100.00 | 99.970 | 0.03                  | 4.400  | 3.807  | 13.5                  |
| s5378          | 179              | 10.563 | 7.919  | 25.0                  | 31.851 | 31.816 | 0.11                  | 10.306 | 8.870  | 13.9                  |
| s9234          | 211              | 12.333 | 9.217  | 25.3                  | 44.670 | 44.635 | 0.07                  | 12.610 | 10.912 | 13.4                  |
| s13207         | 638              | 35.948 | 26.525 | 26.2                  | 55.961 | 55.926 | 0.06                  | 24.275 | 19.154 | 21.1                  |
| s15850         | 534              | 30.688 | 22.801 | 25.7                  | 66.898 | 66.863 | 0.05                  | 22.985 | 18.700 | 18.6                  |
| s35932         | 1728             | 100.00 | 74.500 | 25.5                  | 22.284 | 22.249 | 0.15                  | 100.00 | 80.200 | 19.8                  |

 Table IV. Comparison of Power, Delay and Area for HSSG and ESFF (Normalized to scale of 100)

Table III compares the power, delay and area for the proposed soft error correction design (ESFF-SEC) (Fig. 2) with ISR. On average, we get a 16% improvement in power and 18% improvement in area. The delay is again comparable in both designs. The power savings go down from 21% in ESFF-SED to 16% in ESFF-SEC. This can be attributed to the fact that, with the addition of the C-element to both designs, the percentage savings in a single ESFF-SEC flip-flop over ISR go down. As a result, the overall savings also decrease.

Table IV compares the power, delay and area for the ESFF design and the HSSG design. On average we get a 24% improvement in power and 14% improvement in area at no additional delay overhead. The results indicate the effectiveness of the proposed flip-flops in reducing the overhead involved in adding soft error tolerance and enhanced scan capability to real circuits.
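The averages quoted for Table IV can be cross-checked by taking the arithmetic mean of the per-benchmark improvement columns. The sketch assumes an unweighted mean over the eleven benchmarks, which is one plausible reading of "on average"; the list names are hypothetical.

```python
from statistics import mean

# "%imp. over HSSG" columns from Table IV, in benchmark order.
POWER_IMPROVEMENT = [24.4, 23.8, 26.9, 27.0, 15.4, 23.2, 25.0, 25.3, 26.2, 25.7, 25.5]
AREA_IMPROVEMENT = [13.5, 13.3, 12.9, 13.0, 8.0, 13.5, 13.9, 13.4, 21.1, 18.6, 19.8]

print(f"mean power improvement: {mean(POWER_IMPROVEMENT):.1f}%")
print(f"mean area improvement:  {mean(AREA_IMPROVEMENT):.1f}%")
```

An unweighted mean treats a 14-flip-flop circuit the same as a 1728-flip-flop circuit, so a size-weighted average would give somewhat different figures.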

# V. CONCLUSION

In this paper, we have proposed novel flip-flop designs that are soft error resilient and at the same time have enhanced scan based delay fault testing capability. A simplified version of the flip-flop with enhanced scan delay fault testing capability alone is also presented, which can be used in applications where soft error resilience is not required. The proposed designs achieve low overhead by utilizing existing hardware resources in a typical flip-flop to realize soft error detection/correction and enhanced-scan-like delay fault testing. Compared to the existing techniques, the proposed designs have considerably less power and area overhead with no additional delay penalty.

# VI. REFERENCES

[1] T. Karnik, P. Hazucha, J. Patel, "Characterization of Soft Errors Caused by Single Event Upsets in CMOS Processes," *IEEE Transactions on Dependable and Secure Computing*, Vol. 1, No. 2, April-June 2004.

[2] P. Shivakumar, M. Kistler, S.W. Keckler, D. Burger, L. Alvisi, "Modeling the effect of technology trends on the soft error rate of combinational logic," *International Conference on Dependable Systems and Networks*, 2002, pp. 389-398.

[3] P. Hazucha et al., "Measurements and Analysis of SER-Tolerant Latch in a 90-nm Dual-$V_T$ CMOS Process," *IEEE Journal of Solid-State Circuits*, Vol. 39, No. 9, September 2004.

[4] R. Ramanarayanan, V. Degalahal, N. Vijaykrishnan, M.J. Irwin, D. Duarte, "Analysis of Soft Error Rate in Flip-Flops and Scannable Latches," *IEEE SOC Conference*, Sept. 2003, pp. 231-234.

[5] Q. Zhou, K. Mohanram, "Cost-Effective Radiation Hardening Technique for Combinational Logic," *ICCAD*, 2004, pp. 100-106.

[6] W. Mao et al., "Reducing correlation to improve coverage of delay faults in scan-path design," *IEEE Transactions on CAD*, Vol. 13, No. 5, May 1994, pp. 638-646.

[7] M. L. Bushnell and V. D. Agrawal, *Essentials of Electronic Testing for Digital, Memory, and Mixed-Signal VLSI Circuits*, Kluwer Academic Publishers, 2000.

[8] R. Kuppuswamy et al., "Full Hold-Scan Systems in Microprocessors: Cost/Benefit Analysis," *Intel Technology Journal*, Vol.8, Issue 1, Feb. 2004.

[9] S. Mitra, N. Seifert, M. Zhang, Q. Shi, K. Kim, "Robust System Design with Built-In Soft-Error Resilience," *Computer*, Vol. 38, No. 2, Feb. 2005, pp. 43-52.

[10] J. Tschanz et al., "Comparative Delay and Energy of Single Edge-Triggered & Dual Edge-Triggered Pulsed Flip-Flops for High-Performance Microprocessors," *ISLPED*, 2001, pp. 147-152.

[11] University of California, Predictive Technology Model, http://www.device.eecs.berkeley.edu/~ptm, 2001.

[12] Artisan Standard Cell Library for 0.13-micron TSMC process, http://www.artisan.com/products/standard_cell.html