# 250-600 MHz 12b Digital Filters in 0.8-0.25um Bulk and SOI CMOS Technologies

Lars E. Thon, Gerald P. Coleman, Werner Rausch(#), Dominic Schepis(#), Ronald Schulz(#), Fariborz Assadaraghi(\*), Ghavam G. Shahidi(\*) and Denny D. Tang

IBM Almaden Research Center, San Jose, CA 95120

(#) IBM Microelectronics, Hopewell Junction, NY 12533

(\*) IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598

## Abstract

This paper describes a family of high-speed Finite Impulse Response (FIR) digital filters that have been scaled across three generations of CMOS processes. The processes include commercial varieties as well as experimental bulk and low-power Silicon-On-Insulator (SOI) technologies. Wafer tests demonstrate the speed and power advantages of the experimental SOI technology when applied to a 24k-device digital signal processing (DSP) function, by direct comparison with the same filter manufactured in conventional (bulk) CMOS. The filters presented here have 8 taps, operate on 6b data with 6b programmable coefficients, and deliver 12b fixed-point output values. The layout design is modular, so the number of taps and bits can be changed easily. The filters operate at maximal speeds of 250-600MHz, depending on the fabrication technology, and are suitable for the equalization task in Partial Response magnetic recording channels.

## Introduction

The high speed and low power consumption of submicron SOI circuits have been demonstrated for functions such as frequency dividers[1], dual-modulus prescalers and programmable Phase-Locked Loops[2], as well as 4-bit A/D conversion based on a ladder-less design with comparators that have individually adjusted threshold voltages[3]. These examples represent low-to-medium levels of integration, with device counts such as 8 (dividers), 370 (ADC) and 2400 (PLL). Experiments at higher levels of integration have been reported for SRAM memory designs[4].
In this paper, we report on the impact of SOI technology on a relatively complex (24,000-device) signal processing function and compare the results with those obtained in bulk technologies using the same design. The demands of Partial Response Maximum Likelihood (PRML) hard disk read channels[5] have created a substantial market for high-speed, low-energy FIR-type equalizers[6][7]. The 60% yearly increases in recording density drive up the channel clock speed at a rapid pace and give an economic advantage to FIR designs that can scale across multiple generations of CMOS processes. We describe the design and measured performance of such a family of FIR filters. The filters have been fabricated in IBM's cmos4s[8], cmos5x[9] and experimental Bulk/SOI technologies.

## FIR design

Because the equalizer of a recording channel is part of the gain and timing control loops, it is crucial that the latency of the filter be as low as possible. Also, because the required filter response is a function of the radial head position on the disk, the filter must have programmable coefficients. Finally, the filter must be modular in the number of taps, so that it can be reused under varying system demands and the tap count can be selected late in the concurrent system design process.

The Direct Form II filter architecture (Fig. 1) allows us to fulfill these requirements while achieving a total latency of only 2 clock cycles. The 6b multipliers use Booth encoding to reduce the number of partial products from 6 to 3, while exploiting the semi-static nature of the programmable coefficients to remove the encoding delay from the critical path. Carry-save addition is used in all the taps. Multiplication and addition are merged and together have a latency of 1 cycle. The second cycle of the total latency is the final adder stage, which adds the carry and sum vectors from tap 0.
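
The arithmetic described above can be illustrated with a small behavioural model. The sketch below is not the authors' circuit: the function names are hypothetical, the tap chaining (each tap adds its three Booth partial products to the sum/carry pair arriving from the next-higher tap) is an assumed reading of the text, Python's unbounded integers stand in for fixed-width vectors, and neither the 12b output formatting nor the 2-cycle latency is modelled.

```python
from typing import List, Tuple


def booth_radix4_digits(coeff: int, bits: int = 6) -> List[int]:
    """Recode a signed two's-complement coefficient into bits/2 digits in {-2..2}.

    Digits are returned least significant first, so that
    coeff == sum(d * 4**k for k, d in enumerate(digits)).
    """
    assert bits % 2 == 0
    assert -(1 << (bits - 1)) <= coeff < (1 << (bits - 1))
    pattern = coeff & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def carry_save_add(a: int, b: int, c: int) -> Tuple[int, int]:
    """3:2 compressor: returns (sum, carry) with sum + carry == a + b + c."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def tap(x: int, digits: List[int], s_in: int, c_in: int) -> Tuple[int, int]:
    """Merged multiply/add: fold the Booth partial products into (sum, carry)."""
    s, c = s_in, c_in
    for k, d in enumerate(digits):
        partial_product = (d * x) << (2 * k)
        s, c = carry_save_add(s, c, partial_product)
    return s, c


def fir_carry_save(samples: List[int], coeffs: List[int], bits: int = 6) -> List[int]:
    """FIR filter whose taps keep their running result in carry-save form."""
    # Coefficients are semi-static, so they are recoded once, outside the loop.
    recoded = [booth_radix4_digits(h, bits) for h in coeffs]
    n = len(coeffs)
    s_reg = [0] * n
    c_reg = [0] * n
    out = []
    for x in samples:
        new_s = [0] * n
        new_c = [0] * n
        for k in range(n):
            s_in = s_reg[k + 1] if k + 1 < n else 0
            c_in = c_reg[k + 1] if k + 1 < n else 0
            new_s[k], new_c[k] = tap(x, recoded[k], s_in, c_in)
        s_reg, c_reg = new_s, new_c
        out.append(s_reg[0] + c_reg[0])  # final add of tap 0's sum and carry
    return out


def fir_reference(samples: List[int], coeffs: List[int]) -> List[int]:
    """Plain convolution, used only to check the carry-save model."""
    return [
        sum(h * samples[n - k] for k, h in enumerate(coeffs) if n - k >= 0)
        for n in range(len(samples))
    ]
```

Calling `fir_carry_save(samples, coeffs)` with eight 6b coefficients should give the same values as `fir_reference(samples, coeffs)`. Each coefficient contributes exactly three partial products, and only one carry-propagating addition is needed per output sample.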
This vector-merge adder matches the tap mult/add cycle speed at minimal cost by using a ripple-carry approach with even/odd carry optimization and careful tuning of the carry path circuits (Fig. 2).

Special consideration has been given to manufacturing testability. The latches in the filter are static scanmux-type latches (Fig. 3), connected for complete controllability and observability of all memory elements. The latches use a reverse-propagate (slave towards master) local clock splitter to produce the two phases of the clock, which has the beneficial effect of canceling out most of the setup-time increase caused by the delay of the scanmux located at the master latch input. Other details of the circuit and logic design can be found in [6].

The filter layouts contain no substrate contacts. The resulting speed penalty is accepted in order to avoid noise injection from the digital ground rail into the substrate of a mixed-signal chip.

## Test chips and test setup

The test chips for the FIR filters use different I/O circuits and pad arrangements for the cmos4s and the cmos5x/experimental versions. The cmos5x design is shown in Fig. 4. The filter data and clock inputs are driven by Schmitt-trigger receivers, with several of them in parallel in the case of the clock driver. The data outputs use high-speed pseudo-differential output drivers in which the "negative" outputs of all the drivers are tied together and brought to a common return-ground pad (Fig. 5). A bias current is selected off-chip to adjust the current levels in the differential pairs. A typical bias value is 4mA for a 200mV swing into a 50ohm load. The coefficient register and the scan path are driven by and observed through conventional single-ended receivers and drivers. The I/O circuits and the filter core are divided into 5 separate power domains, so that one can accurately measure the filter power consumption as well as adjust the core and I/O voltages independently if necessary.
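
As a quick check of the output driver figures quoted above, the swing can be estimated with Ohm's law. This is a minimal sketch assuming the full bias current is steered into a single 50ohm load; the function names are hypothetical and not part of the original design.

```python
def output_swing_volts(bias_current_a: float, load_ohms: float) -> float:
    """Voltage swing when the whole bias current flows through the load."""
    return bias_current_a * load_ohms


def bias_for_swing_amps(swing_v: float, load_ohms: float) -> float:
    """Bias current needed to reach a target swing into the given load."""
    return swing_v / load_ohms


if __name__ == "__main__":
    # Typical values from the text: 4 mA into 50 ohm.
    print(output_swing_volts(4e-3, 50.0))
    # Inverse question: bias needed for a 200 mV swing into 50 ohm.
    print(bias_for_swing_amps(0.2, 50.0))
```

Under this assumption the swing scales linearly with the off-chip bias, which is how the current levels in the differential pairs can be adjusted without changing the chip.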
Measurements are based on wafer probing using a 50ohm controlled-impedance probe card and a 620MHz pattern generator, as shown in Fig. 6. Fig. 7 is a sampling oscilloscope snapshot of an SOI filter operating at 564MHz with a 2.5V supply voltage.

## Experimental Bulk and SOI technology

The cmos4s and cmos5x technologies are commercial processes described in more detail in [8][9]. The experimental CMOS technology is of the quarter-micron class and has been applied to both regular bulk and SOI wafers. Fig. 8 shows a cross section of the SOI technology. Table 1 shows the key technology parameters. The experimental wafers were made with the same mask set as the cmos5x wafers, hence the same drawn gate length of 0.5um is listed for both.

Because the experimental process has a thinner gate oxide, t(ox) of about 5nm versus 7nm, the gate capacitance for same-size transistors is higher than in the cmos5x technology. Gate capacitance per unit area is inversely proportional to t(ox), so with identical masks the total capacitance switched during operation of the chip is higher by a factor of roughly 7/5, or about 40%, for the experimental bulk wafers. On the other hand, the drive of each transistor is also inversely proportional to t(ox), leading us to expect about the same speed from an experimental bulk wafer as from a cmos5x wafer. In the case of the SOI wafers, an additional speed gain is expected due to the large reduction in the source and drain capacitances.

## Performance

Measurements were made on a number of cmos4s, cmos5x and experimental bulk and SOI wafers. Fig. 9 shows the maximal speed f(max) of some filters as a function of the supply voltage Vdd. Correct operation up to 620MHz was observed. The cmos5x mask set contains ring oscillators that were used to gauge the overall performance of each wafer. Fig. 10 shows the speed of ring oscillators equipped with divider circuits. The ring oscillators are based on inverters and work down to a lower supply voltage than the digital filters.
The reason for this behavior is not known, but it may be that the FIR cores cannot switch the I/O drivers at low voltages.

Fig. 11 shows the correlation between ring speed and filter speed, expressed by plotting x=fmax(ring) and y=fmax(filter) using Vdd as the independent parameter. The correlation curves have a characteristic "S" shape. The flat part at the bottom left represents the low-Vdd case where the ring is operational but the filters are not. The linear region shows good correlation between ring speed and filter speed. The flat region at the top represents the limitation of the test equipment (620MHz) in driving the filter inputs.

The energy efficiency of the filter can be defined as the energy per operation (computed output sample). Fig. 12 shows the energy consumption of the filters as a function of the maximal operating frequency, with Vdd as the underlying parameter. The SOI wafers are considerably more energy efficient than the cmos5x wafers, by as much as a factor of 3 at the 400MHz operating point. Much of this gain is due to the much lower Vdd required to obtain a given speed for the SOI wafers. At the *same* Vdd, the SOI wafers use slightly more energy than the bulk cmos5x wafer. This is due to factors alluded to earlier: (1) a 7/5 factor increase in gate capacitance per unit area, combined with (2) the use of the same mask as the cmos5x wafers (no shrink). With the proper shrink in place, the power consumption for the SOI will be lower for the same Vdd and f(clk), but even more significant gains can again be obtained via the lower Vdd required to reach a given speed with the SOI circuits.

## Conclusion

We have described a family of custom-layout 12b CMOS digital filters that are portable across three generations of process technology. Clock rates above 600MHz have been observed for 8-tap SOI versions of the filters.
The energy consumption for SOI filters was less than half that of plain CMOS filters with the same drawn gate length when both operate with their optimal Vdd and at the same speed. The dominant effects for low-power operation of SOI relative to bulk CMOS were the ability to lower the Vdd for the same clock rate and the reduction in source/drain junction capacitance.

## Acknowledgements

The authors would like to thank P.Sutardja, F.Lai and K.Wrenner for their I/O circuit designs, IBM Rochester dept. HDCA/54 for assembly of the cmos4s test site, IBM Burlington depts. B53V/N98V/N02V for cmos5x assistance, and R.Lynch, M.Chen and R.Scranton for their support.

## References

[1] M.Fujishima, K.Asada, Y.Omura and K.Izumi: Low-Power 1/2 Frequency Dividers Using 0.1um CMOS Circuits Built with Ultrathin SIMOX Substrates. IEEE J. of Solid-State Circuits, Vol.28, p.510-512, Feb 1993.

[2] Y.Kado, M.Suzuki, K.Koike, Y.Omura and K.Izumi: A 1-GHz/0.9-mW CMOS/SIMOX Divide-by-128/129 Dual-Modulus Prescaler Using a Divide-by-2/3 Synchronous Counter. IEEE JSSC, Vol.28, p.513-517, Feb 1993.

[3] R.H.Walden, A.E.Schmitz, A.R.Kramer, L.E.Larson and J.Pasiecznik: A Deep-Submicrometer Analog-to-Digital Converter Using Focused-Ion-Beam Implants. IEEE JSSC, Vol.25, p.562-571, Feb 1993.

[4] G.G.Shahidi, C.A.Anderson, B.A.Chappel, T.I.Chappel, J.H.Comfort, B.Davari, et al: A Room Temperature 0.1 um CMOS on SOI. IEEE Transactions on Electron Devices, Vol.41, p.2405-2412, Dec 1994.

[5] R.D.Cideciyan, F.Dolivo, R.Hermann, W.Hirt and W.Schott: A PRML system for digital magnetic recording. IEEE J. on Selected Areas in Communications, p.38-56, Jan 1992.

[6] L.Thon, P.Sutardja, F.Lai, G.Coleman: A 240 MHz 8-tap Programmable FIR Filter for Disk Drive Read Channels. Proc. IEEE International Solid State Circuits Conference, p.82-83, San Francisco, Feb 1995.

[7] D.Xu, Y.Song, G.T.Uehara: A 200MHz 9-Tap Analog Equalizer for Magnetic Disk Read Channels. Proc. IEEE International Solid State Circuits Conference, p.74-75, San Francisco, Feb 1996.

[8] IBM Corporation: CMOS4S technology data. See http://www.chips.ibm.com/products/ams/tech/amstqr.html.

[9] C.W.Koburger, W.F.Clark, et al: A half-micron CMOS logic generation. IBM J. Research and Development, Vol.39 No.1/2, pp.215-227, January/March 1995. See also http://www.chips.ibm.com/products/asics/tech/cmos5x/cmos5x.html.

| Parameter    | cmos4s  | cmos5x  | experimental |
|--------------|---------|---------|--------------|
| L(drawn)     | 0.80 um | 0.50 um | 0.50 um      |
| L(effective) | 0.45 um | 0.25 um | ~0.25 um     |
| T(ox)        | 12 nm   | 7 nm    | ~5 nm        |
| Metal layers | 3       | 3       | 3            |

Table 1: Technology Parameters.

Fig. 1: Filter architecture.

Fig. 2: Vector Merge carry chain.

Fig. 4: Test chip plot (cmos5x), 28 Dec 1995.

Fig. 7: SOI filter operating at 564 MHz.

Fig. 11: Ring versus Filter speed correlation.

Fig. 12: Filter energy efficiency across multiple technologies and wafers.