# Design and Defect Tolerance Beyond CMOS

Xiaobo Sharon Hu\*, Alexander Khitun†, Konstantin K. Likharev‡, Michael T. Niemier\*, Mingqiang Bao†, Kang L Wang† \*Department of Computer Science and Engineering University of Notre Dame, Notre Dame, IN, U.S.A. shu@nd.edu, mniemier@nd.edu <sup>‡</sup> Department of Physics and Astronomy, Stony Brook University Stony Brook, NY, U.S.A. klikharev@notes.cc.sunysb.edu <sup>†</sup>Department of Electrical Engineering University of California Los Angeles, Los Angeles, CA, U.S.A. ahit@ee.ucla.edu, baoming@ee.ucla.edu, wang@ee.ucla.edu

#### ABSTRACT

It is well recognized that novel computational models, devices and technologies are needed in order to sustain the remarkable advancement of CMOS-based VLSI circuits and systems. Regardless of the models, devices and technologies, any enhancement/replacement to CMOS must show significant gains in at least one of the key metrics (including speed, power and cost) for at least a subset of application domains currently employing CMOS circuits. In addition, effective defect tolerant techniques are a critical factor for the successful adoption of any new computing device due to the fact that nano-scale structures will have defect rates much higher than today's CMOS chips. The task of identifying application domains that could benefit the most from a new model/device/technology and ensuring that the resultant system meets functional requirements in the presence of defects requires synergistic efforts of physical scientists, and circuit and system design researchers.

This paper contains a collection of three contributions—each focusing on one particular emergent technology—presenting a basic introduction on the technologies, some of their unique features in contrast with CMOS, potential application domains for these technologies, and new opportunities that they may bring forward in defect tolerance design. The contributions include both traditional and nontraditional state representations which use either electronic or magnetic interactions.

# 1. CMOL AND COUSINS: HYBRID CMOS/NANO CIRCUIT FAQS

#### Konstantin K. Likharev

**Q:** What is CMOL?

A: The basic idea of hybrid CMOS/nanoelectronic circuits is to complement the CMOS stack with a few-layer nanoelectronic add-on (Fig. 1a) in the form of a nanowire crossbar (Fig. 1b). This idea may be traced back at least to the pioneering paper by J. Heath *et al.* [19]; however, the authors of that work and several following works in which this concept has been developed (see, e.g., reviews [10, 37, 53]) have assumed the use of relatively complex, three-terminal nanoelectronic devices whose integration is still well beyond reach. The current stage of the hybrid circuit idea development (started in 2003 [33, 39], and evolving substantially through late 2005 [34, 37, 59]) is focused on hybrid circuits which do not use any active nanoelectronic components beyond similar, simple (two-terminal), bistable devices (Fig. 1c) formed at each crosspoint simultaneously with the crossbar patterning.

**Q:** What are the main options for crosspoint device implementation? Does the acronym "CMOL" imply using molecular devices?

A: The answer to the latter question is NO. This (admittedly, misleading) term was coined in 2003, when molecular electronics seemed the only option for the implementation of crosspoint devices. By now, two-terminal crosspoint devices with the necessary "latching switch" functionality (Fig. 1c) have been demonstrated using a broad variety of materials and fabrication techniques - see, e.g. Refs. [27, 65] for recent reviews. For most of them, the device-to-device reproducibility (which is, of course, necessary for integration) has not yet been documented; however, there are notable exceptions. For example, I. G. Baek et al. [3] have demonstrated a few-percent reproducibility of the effective ON resistance of metal-oxide-based devices, while A. Chen *et al.* [6] have reported a (still acceptable)  $\sim$ 30% r.m.s. spread of ON currents in copper-oxide-based junctions. Even more promising, J. Billen et al. [4] have achieved  $\sim$ 7% and  $\sim$ 20% r.m.s. scattering of the, respectively, OFF $\rightarrow$ ON and ON $\rightarrow$ OFF switching thresholds in (relatively thick) Cu-TCNQ layers, whereas S. Jo and W. Lu [21] have reported a  $\sim 10\%$  spread of the OFF $\rightarrow$ ON switching voltage in amorphous-Si-based devices.

The apparent bistability mechanism in all these devices is reversible field-induced drift of cations in amorphous oxide matrix, leading to conducting filament formation and dissolution. Preliminary estimates show that this physics may give reproducible devices all the way to  $F_{nano} \sim 10$  nm; after that other materials may become necessary, for example specially designed molecular self-assembled monolayers ("SAMs") with the atomic-reconfiguration [11] or single-electron [15, 33, 39, 43] bistability mechanisms. The recent revolutionary breakthrough [2] in reproducible SAM fabrication gives every hope that these devices may be integrable.

**Q:** Any other components you need?

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

CODES+ISSS'08, October 19-24, 2008, Atlanta, Georgia, USA.

Copyright 2008 ACM 978-1-60558-470-6/08/10 ...\$5.00.



Figure 1: (a) The general idea of a hybrid CMOS/nanoelectronic circuit, (b) the nanowire-crossbar add-on, and (c) the required I - V curves of the two-terminal crosspoint devices (schematically).

A: The only other key ingredient of the current generation of hybrid circuits is an area-distributed interface between the CMOS stack and nanowire crossbar, using cone-shaped vertical plugs ("pins", see Fig. 1a), instead of peripheral interfaces discussed in earlier publications. A major trick here is the rotation of the crossbar by a certain angle with respect to the interface pin mesh [33, 34, 37, 39], which gives the CMOS subsystem unique access to each nanowire and each crosspoint nanodevice, even if the crossbar half-pitch  $F_{nano}$  is much less than the CMOS half-pitch  $F_{CMOS}$ .

**Q:** The CMOL concept implies a patterning technology with nanoscale resolution. If such technology is on hand, why not use it for further CMOS scaling, instead of the hybrid circuit fabrication?

A: First, each layer of the nanowire crossbar requires only one simple pattern - a set of parallel lines. Second, the tolerated fluctuations of the pattern dimensions are of the order of  $F_{nano}/3$ , i. e. much larger than those which are required for fabrication of a MOSFET with a minimum feature size of  $F_{nano}$  [1, 33]. Finally, the line patterns do not need to be aligned with either each other or the CMOS subsystem [35]. All these factors allow the use of such advanced patterning methods as nanoimprint (see, e.g., [5, 63]), as well as maskless methods like EUV IL [5, 52] and block-copolymer lithography [5, 18] for crossbar fabrication - see, e. g., recent impressive demonstrations of crossbars with  $F_{nano} \approx 15$  nm [17, 22].

**Q:** OK, the hybrids look relatively simple, but would still require some research and development effort. Could this effort be justified? How much advantage in system performance can the hybrid circuits provide?

**A:** This issue has been addressed in several recent studies [16, 28–30, 38, 41, 42, 54–59, 62] in which the following systems have been explored:

(i) CMOL memories (which are just a hybrid-circuit extension of resistive memories [36, 44], with each bit stored in the internal state of a certain crosspoint device - see Fig. 1c, but peripheral functions embodied in the CMOS subsystem), may have the effective bit area close to  $4F_{nano}^2$  [57], eventually enabling terabit-scale integration [38].

(ii) CMOL reconfigurable (cell-FPGA-like) logic circuits [54, 56, 59] may provide a density advantage of about 2 orders of magnitude over purely CMOS circuits of the same functionality,  $F_{CMOS}$  and power density, at comparable speed.

(iii) Though custom CMOL VLSI circuits have not been explored in any detail yet, there are preliminary indications [58] that these circuits will have a lower advantage in density, but substantially increased speed (again, at the same power).

(iv) Mixed-signal neuromorphic CMOL networks ("CrossNets" [15, 16, 31, 39, 62]) may provide extremely high performance for certain advanced information processing tasks such as pattern classification (including ultrafast feature recognition [28]), and more intelligent tasks, in particular those requiring in-situ training [29, 30] and global reinforcement learning [41]. While today such "cognitive" tasks may be considered niche applications, there is a good chance that in future they will form a new, fast growing IT market.
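As a rough illustration of item (i), the sketch below converts an effective bit area of about $4F_{nano}^2$ into an areal bit density. The function name and the sample half-pitch values are illustrative assumptions, and the estimate ignores any CMOS peripheral overhead, so it is a back-of-the-envelope upper bound rather than a result taken from [57] or [38].

```python
def cmol_bit_density_per_cm2(f_nano_nm: float, area_factor: float = 4.0) -> float:
    """Idealized bits per cm^2 for a bit cell of area_factor * F_nano^2.

    Assumption: every crosspoint stores one bit and no area is lost
    to CMOS periphery or defect-tolerance overhead.
    """
    bit_area_nm2 = area_factor * f_nano_nm ** 2
    nm2_per_cm2 = 1e14  # (1e7 nm per cm) squared
    return nm2_per_cm2 / bit_area_nm2


if __name__ == "__main__":
    # Sample half-pitches (illustrative values only).
    for f_nano in (15.0, 5.0, 3.0):
        density = cmol_bit_density_per_cm2(f_nano)
        print(f"F_nano = {f_nano:4.1f} nm -> {density:.2e} bits/cm^2")
```

Under these assumptions a half-pitch of 5 nm corresponds to about $10^{12}$ bits per cm², which is the terabit scale mentioned in item (i).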

**Q:** Even after an additional effort, the crosspoint nanodevices may not be 100% perfect. How defect-tolerant are CMOL circuits?

A: So far, only one defect type (equivalent to a stuck-at-open fault) has been explored in detail. To such defects, properly designed CMOL circuits are very tolerant, allowing  $\sim 10\%$  of bad devices for memories [57],  $\sim 20\%$  for FPGA-like logic [54, 56, 59], and more than 30% for some neuromorphic circuits [30]. However, sensitivity to other types of defects (e.g., stuck-at-closed faults or nanowire breaks) may be higher, and this issue has to be explored in more detail.

**Q:** Your title mentions CMOL "cousins". What exactly are their differences from the generic CMOL circuits, and what advantages and handicaps may they have?

A: Most notably:

(i) G. Snider and R. S. Williams have suggested [51] a simplified version of CMOL circuits, dubbed FPNI, in which more space is provided for the CMOS/crossbar interface, and crosspoint devices are stripped of their role in logic, i.e. restricted to the reconfiguration function. These circuits are easier to implement, and may be useful at the initial stage of hybrid circuit development, but have a factor of  $\sim$ 3 lower density.

(ii) On the contrary, the "3D CMOL" circuits suggested by W. Wang's group [61] allow a two-fold increase of density in comparison with the original ("2D") CMOL. Such a "3D CMOL" circuit is actually a system of two CMOS chips bonded around a single nanowire crossbar. An additional benefit of such circuits is that their component chips can be planarized at all levels, while the original CMOL circuits cannot be planarized at the lower pin level. (This fact does not prevent a plausible flow of their fabrication [35].) One of the challenges for the 3D CMOL implementation is whether chips may be made sufficiently planar for nanoscale bonding, at acceptable cost.

**Q:** What are the CMOL scaling limits?

A: Apparently, the most fundamental limit to CMOL scaling is quantum-mechanical tunneling between parallel nanowires of the crossbar. Theoretical estimates show that the corresponding leakage current becomes a forbidding challenge at  $F_{nano} \sim 2$  nm (for air gaps or very wide bandgap insulators). However, due to unavoidable gap width fluctuations, the practical limit is probably closer to 3 nm. Moreover, at the approach to this frontier, several other problems become very serious, including:

(i) nanowire resistance growth due to electron scattering on grain boundaries,

(ii) interconnect pin sharpness and position uncertainty, and

(iii) variations of the crosspoint device cross-section area.

This allows me to believe that  $F_{nano} \sim 3$  nm is the natural scaling limit for CMOS/nanoelectronic hybrids.

**Q:** Any summary?

A: The practical introduction of CMOS/nano hybrids would preserve (and further develop) the huge technological and design infrastructure of the semiconductor IC industry, while enabling an extension of Moore's Law by an estimated 10 to 15 years beyond the "red brick wall" faced by evolutionary CMOS circuits [38]. Several challenges are still to be met before the industrial fabrication of the hybrid circuits [35], but they seem substantially less serious than those faced by any other post-CMOS integrated circuit technology concept. For the digital circuit design community, the main current challenge is a thorough simulation of several representative CMOL ASICs (in order to quantify their possible advantage over CMOS circuits with the same functionality), and a detailed study of their tolerance to a broad set of fabrication defects. (Both tasks will certainly require more complex CMOL circuit design tools.) My advice to analog circuit designers is to have a good look at the enormous prospects offered, especially in the long run, by neuromorphic CMOL networks [31, 62].

Useful discussions of the issues considered in this paper with P. Allen, J. Barhen, S. Das, A. DeHon, P. Franzon, D. Hammerstrom, R. Karri, R. Kiehl, P. Kuekes, J. H. Lee, X. Liu, J. Lukens, X. Ma, A. Mayr, V. Patel, N. Simonian, G. Snider, M. Stan, D. Stewart, D. Strukov, Z. Tan, W. Wang, R. Waser, and R. S. Williams are gratefully acknowledged. The research work on CMOL at Stony Brook was supported in part by AFOSR, DoD, FCRP (via FENA Center), and NSF.

# 2. MAGNETIC QCA

Michael T. Niemier, X. Sharon Hu

#### 2.1 Introduction

Magnetic logic based on coupled ferrite cores was originally pursued in the 1950s, but was eventually replaced by semiconductor chips. The lithographically-defined nanomagnets that form the basis of this work (i) do not possess the disadvantages of the early, bulky, ferrite core magnets, and (ii) can be arranged to form circuits within the quantum-dot cellular automata (QCA) architecture scheme [20]. The initial description of a QCA device called for encoding binary numbers into cells that have a bi-stable charge configuration. A QCA cell would consist of 2 or 4 "charge containers" (i.e. quantum dots) and 1 or 2 excess charges respectively. One configuration of charge represents a binary '1' and the other a binary '0' [32]. Logical operations and data movement are accomplished via Coulomb (or nearest-neighbor) interactions. QCA cells interact because the charge configuration of one cell alters the charge configuration of the next cell. In a magnetic implementation of QCA (MQCA), charge configurations are replaced with magnetic polarizations.

For MQCA, wires, gates, and inverters have all been experimentally realized, they operate at room temperature [20], and [8] estimates that if  $10^{10}$  magnets switch  $10^8$  times/second, they would only dissipate about 0.1 W of power. When the drive circuitry is included, [46] predicts that circuits could provide performance wins over state-of-the-art, low power CMOS when considering energy delay product<sup>1</sup>. Devices can scale and remain non-volatile provided their size/shape remains above the superparamagnetic limit. However, binary state in nanomagnets with feature sizes below the superparamagnetic limit can be stable for around 1 ms [66] – long enough to perform logical operations. Scaling can also decrease switching times [66].

Application spaces could be abundant as MQCA devices should be low power and non-volatile, and any application that has these performance requirements might benefit. Patterned thin-film nanomagnets are also similar in nature and compatible with the processing of MRAM devices. For MRAM technology, the physical coupling between neighboring magnetic bits is undesirable, but we attempt to use it to our advantage for MQCA. Moreover, the problem of setting or reading a magnetic bit is similar for MRAM and MQCA: in both cases the magnetization state of a nanometer-size thin-film island has to be written and read. In other words, we will be able to capitalize on advances made in magnetic data technology to address input and output with MQCA.

Figure 2: Cartoon representations of (a) a wire segment and (b) a majority gate. Wire segments have been experimentally demonstrated (c) as have majority gates (d).

Still, like any device with nanometer feature sizes, MQCA based circuits could suffer from defect rates that are much higher than those for CMOS-based circuits. Thus, an MQCA-based circuit must not only perform better for some computational task of interest (to justify a technology transition), but realistically will need to do so with more faulty components. Fabrication processes envisioned for MQCA are similar to those for CMOS and fabrication variations should be similar as well. However, because MQCA devices process information in different ways than CMOS devices, defect tolerance mechanisms will be different. We study these issues here.

#### 2.2 Background

Figs. 2a-b illustrate two important building blocks that would be used to construct MQCA circuits. A wire (Fig. 2a) is just a line of magnets that are antiferromagnetically coupled with each other. The basic logic gate in MQCA is based on the majority voting function. By setting one input of a majority gate to a logic '0' or '1', the gate will execute an AND or OR function respectively. In MQCA, the gate performs an *inverting* majority gate function (Fig. 2b). These structures have all been experimentally demonstrated at room temperature (see Fig. 2c,d [20]).
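The logical behavior of these two building blocks can be sketched in a few lines of Python. This is an idealized, purely Boolean model (no magnetics, no clocking, no defects); the function names and the convention that the first magnet of a wire holds the input bit are assumptions made for illustration.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote on bits 0/1."""
    return int(a + b + c >= 2)


def inverting_majority(a: int, b: int, c: int) -> int:
    """Idealized MQCA gate: the complement of the majority vote."""
    return 1 - majority(a, b, c)


def wire_output(input_bit: int, n_magnets: int) -> int:
    """Bit held by the last magnet of an antiferromagnetically coupled line.

    Assumption: magnet 0 holds the input bit and each subsequent magnet
    settles opposite to its neighbor.
    """
    return input_bit ^ ((n_magnets - 1) % 2)


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        # Fixing the third input to 0 gives AND, fixing it to 1 gives OR;
        # the inverting gate then yields NAND and NOR respectively.
        print(a, b,
              "AND", majority(a, b, 0),
              "OR", majority(a, b, 1),
              "NAND", inverting_majority(a, b, 0),
              "NOR", inverting_majority(a, b, 1))
```

In this model a wire with an even number of magnets delivers the complement of its input, so wire length is part of the logic function.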

The structures illustrated in Fig. 2c,d were tested with a clock that took the form of a periodically oscillating *external* magnetic field that drove a system to an initial state, and then controlled the relaxation of the said system to a ground state. For example, a line of nanomagnets would begin in a logically correct, antiferromagnetically coupled ground state. An external field turns the magnetic moments of all magnets horizontally into a neutral logic state against the preferred magnetic anisotropy (i.e. along a magnet's hard axis). This is an unstable state of the system, and when the field is removed, the nano-magnets relax into a new antiferromagnetically ordered ground state in accordance with the new input. [46] explored the use of copper wires wrapped by ferrite on the sides and bottom to provide *local* control of MQCA-based circuits. Nanomagnets would reside on the wire surface.

#### 2.3 Fault Tolerance

Per the discussion in Sec. 2.1, mechanisms for fault tolerance that do not adversely affect system-level performance are essential. Here, we discuss three ways to provide it for MQCA.

#### 2.3.1 At the Circuit Level

Electron beam lithography (EBL) – used to fabricate the magnets shown in Fig. 2 – can lead to fabrication variations such as bulges (where a magnet's aspect ratio is smaller than intended), edge roughness, and missing magnetic material (an edge or corner of a magnet is a common location). Of particular interest is how these fabrication variations affect logical correctness – or more specifically, how the magnetization (binary state) associated with a previous computation is "removed". This process is essential as it allows the nanomagnets that make up MQCA circuit elements to be re-evaluated with new inputs as discussed above. We can begin to answer this question by leveraging the OOMMF simulation suite [12].

<sup>1</sup>While magnetic switching times are expected to be on the order of 50-100 ps [20], extremely low switching energies could lead to competitive EDPs.

Figure 3:  $H_{clock}$  vs.  $M_y$  for a 60x90 nm magnet with no fabrication variation and a 60x90 nm magnet with a "slanted" edge.  $H_{clock}$  required to null the slanted magnet is greater than that for the perfect magnet.

As an example, we consider a magnet with material missing from one of its corners as well as a magnet with no fabrication variation. Of interest is the magnitude of the external field (referred to as  $H_{clock}$ ) required to "null" each magnet so that we can tip it to the opposite polarization by leveraging a local biasing field. (The same biasing field was used for all three simulations.) Results are illustrated in Fig. 3. We consider the down-to-up transition of the misshapen magnet first (see middle inset in Fig. 3). Note that a stronger external field is required to null this magnet (approximately  $0.6 \times 10^5$  A/m instead of  $0.5 \times 10^5$  A/m for the non-misshapen magnet). Magnetic moments tend to align along a magnet's edge. In this simulation, the placement of the slant and the direction of the applied external field help to reinforce the initial downward polarization  $(\downarrow)$ . For this same reason, the up-to-down transition (see top inset in Fig. 3) can be accomplished when the magnitude of  $H_{clock}$  is lower (approximately  $0.4 \times 10^5$  A/m). (Only the first portion of this curve is shown – i.e. until the magnet is nulled – to improve graph readability.)

While the above suggests that increasing the current in a clock wire (and hence the magnitude of  $H_{clock}$ ) can help to ensure logical correctness at the expense of energy efficiency, we still have to carefully consider how the external field is used to control logic gates constructed with nanomagnets. Previous work has shown that the state of a stuck-at fault (for example) is determined by the previous state of a group of magnets, the location of missing material, and the direction of  $H_{clock}$  – and can change based on what inputs are applied [47].

#### 2.3.2 At the Architectural Level

The most widely used mechanism for post-fabrication fault tolerance comes at the architectural-level in the form of reconfigurable logic. We can apply this lever in MQCA-based systems as well. For example, the PLA structure in [7] can be expanded to include more rows and columns such that defective crosspoints and/or interconnect can be avoided – increasing the probability that the desired set of logic functions can be mapped onto the faulty PLA. However, for MQCA, a larger PLA not only means a larger chip area, but also more/longer clocking wires to control the logic and interconnect associated with it. Therefore, redundancy in an MQCA PLA provides a way to trade power consumption for fault tolerance.
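To make the redundancy trade-off concrete, the following toy model estimates yield for a PLA with spare rows. It is not the model used in [7]: it assumes independent crosspoint faults, treats a row as usable only if all of its crosspoints are good, and ignores interconnect faults and column sparing. All names and sample parameters are hypothetical.

```python
from math import comb


def row_ok_probability(fault_rate: float, crosspoints_per_row: int) -> float:
    """Probability that every crosspoint in one row is fault-free."""
    return (1.0 - fault_rate) ** crosspoints_per_row


def pla_yield(rows_needed: int, spare_rows: int,
              crosspoints_per_row: int, fault_rate: float) -> float:
    """P(at least rows_needed usable rows among rows_needed + spare_rows)."""
    q = row_ok_probability(fault_rate, crosspoints_per_row)
    total = rows_needed + spare_rows
    return sum(comb(total, k) * q ** k * (1.0 - q) ** (total - k)
               for k in range(rows_needed, total + 1))


if __name__ == "__main__":
    # Hypothetical PLA: 20 product-term rows with 16 crosspoints each.
    for fault_rate in (1e-3, 1e-2):
        for spares in (0, 2, 10):
            y = pla_yield(20, spares, 16, fault_rate)
            print(f"fault rate {fault_rate:g}, {spares:2d} spare rows: "
                  f"yield {y:.3f}")
```

In an MQCA PLA, each spare row would also add clock-wire length, so the yield gained this way has a power cost that a pure area model does not capture.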

Consider the yield vs. fault rate study presented in [7]. This study indicates that a yield of 90% is possible given a fault rate of  $10^{-3}$  and 10% redundancy. However, if the fault rate increases to  $10^{-2}$ , 400% redundancy is required. As seen in Sec. 2.3.1, increasing the magnitude of  $H_{clock}$  provides another level of flexibility to circuit designers in terms of fault tolerance. However, increasing the magnitude of  $H_{clock}$  can cause power to quadruple. Thus, from the standpoint of performance and logical correctness, it is an interesting optimization problem to determine the most effective usage of the above mechanisms for fault tolerance. Together these techniques could allow for tolerance of a higher fault rate than can be achieved by either individually. However, one technique might be sufficient to provide the fault tolerance required to achieve a desired yield with the smallest increase in power.
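The power cost of a stronger clock field can be sketched with a simple scaling argument. The assumptions here are not stated in the text: the field produced by a clock wire is taken to be proportional to its current, and the dissipated power to the square of that current, so that power scales as the square of $H_{clock}$. The function name is hypothetical; the field values are those quoted in Sec. 2.3.1.

```python
def relative_clock_power(h_new: float, h_ref: float) -> float:
    """Clock power at h_new relative to h_ref.

    Assumptions: H is proportional to the wire current I, and resistive
    dissipation is proportional to I**2, so power scales as H**2.
    """
    return (h_new / h_ref) ** 2


if __name__ == "__main__":
    h_perfect = 0.5e5   # A/m, nulling field for the magnet without variation
    h_slanted = 0.6e5   # A/m, nulling field for the misshapen magnet
    ratio = relative_clock_power(h_slanted, h_perfect)
    print(f"Power ratio to null the slanted magnet: {ratio:.2f}")
    print(f"Power ratio if the field is doubled: "
          f"{relative_clock_power(2.0, 1.0):.2f}")
```

Under this scaling, doubling the field quadruples the power, while the roughly 20% increase needed for the slanted magnet costs about 44% more power.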

#### 2.3.3 At the Device Level

As seen so far, faults can result from the processes used to make a magnet with a particular shape – which is very much a function of various types of lithography. In Sec. 2.3.1, all simulations assumed nanomagnets made from supermalloy. However, other magnetic materials can also be used. For example, [20] considered magnets made with permalloy – a magnetic material with a higher saturation magnetization ( $860 \times 10^3$  A/m versus  $800 \times 10^3$  A/m for supermalloy [12], [9]). An advantage of the higher saturation magnetization is that a magnet can be considered to be a stronger driver (i.e. a stronger '1' or '0') which provides more local control over a potentially defective device. However, a higher saturation magnetization can also make a device harder to null (greater  $H_{clock}$  needed) which will increase the overall system energy.

Like the material it is made from, a magnet's shape might also help to facilitate the implementation of robust circuit constructs (i.e. a different shape might increase the flux density in another part of the design). While material and shape parameters obviously cannot be changed post fabrication, they represent other levers that one can use to increase the probability of realizing an efficient and logically correct system.

#### 2.4 Discussions

To date, our research efforts have focused on ensuring that all of the components necessary for a computationally interesting, physically realizable system are in fact viable (e.g. means for crossing signals with nanomagnets, moving data between adjacent clock wire groups, etc.). We are now in the position to explore each of the aforementioned items in more detail to determine whether or not physically realized structures can be both logically correct and simultaneously offer performance wins over the state of the art in CMOS. A fundamental step in this process is "looking up" to the application level to ensure that the more detailed solutions map well to the dataflow/performance requirements for a subset of computational tasks; this is the subject of ongoing work.

The authors gratefully acknowledge the support of the NSF under grant numbers CCF06-21990, CCF05-41324, and CCF07-02705, as well as the SRC NRI funded MIND center.



Figure 4: Schematics of the integrated electronic-spin-wave circuit. The spin wave circuit receives information in the form of voltage pulses, converts them into spin wave signals, makes computation using spin waves, and provides the output in the form of the voltage pulses.

# 3. MAGNETIC CIRCUITS WITH SPIN WAVE BUS FOR DATA PROCESSING

Alexander Khitun, Mingqiang Bao, Kang L Wang

#### 3.1 Introduction

As the perfection of Complementary Metal Oxide Semiconductor (CMOS) devices is rapidly coming to its end due to the major challenges associated with power dissipation and manufacturing complexities, there is a great deal of practical interest in the implementation of novel nanometer scale devices and novel architectures able to provide a route to further information processing rate enhancement. Spintronics is one of the possible approaches, aimed at exploiting electron spin rather than electron charge as an information carrier [48]. The information transmission among the spin-based devices may be done naturally through quantum mechanical interactions such as spin waves.

A spin wave is a collective oscillation of spins in an ordered spin lattice around the direction of magnetization. The phenomenon is similar to lattice vibration, where atoms oscillate around their equilibrium position. In our preceding works [24,25,64], we have developed the general concept of logic circuits with a Spin Wave Bus - a ferromagnetic waveguide that can be used as a conduit for spin wave propagation. There are several distinct features and key advantages of using spin waves: (i) information transmission is accomplished without electron transport; (ii) a bit of information can be encoded into the phase of the propagating spin wave; (iii) a number of spin waves with different frequencies can be simultaneously transmitted through the bus; (iv) the coherence length of the spin wave at room temperature may exceed tens of microns, which makes it possible to utilize spin wave interference to achieve logic functionality; (v) interactions between spin waves and outside devices can be done in a wireless manner, via a magnetic field.

#### 3.2 Logic devices utilizing spin wave interference

Fig. 4 schematically shows an integrated electro-magnetic logic circuit. It consists of the voltage-to-spin wave converters, ferromagnetic waveguide structure, spin wave amplifier, and spin wave-to-voltage converter. The input data are received in the form of voltage pulses (i.e. the input signal amplitudes of +1V and -1V correspond to the logic states 1 and 0, respectively). Next, the input information is encoded into the phase of the spin wave. The conversion of the voltage signal into the spin wave phase can be accomplished by the microstrips. Depending on the polarity of the input signal, the initial phase of each spin wave may have a relative phase difference of e.g.  $\pi$ . Phases of "0" and " $\pi$ " are used to represent two logic states 1 and 0. The spin waves propagate in the ferromagnetic waveguide structure designed to perform useful logic functions. An example of the three-input Majority logic gate is shown in Fig. 4. Depending on the relative phase of the spin waves, the amplitude of the resultant wave can be enhanced or decreased, as a function of the number of waves coming in phase and out of phase. Finally, the result of the computation is converted into a voltage pulse, and may be amplified by a conventional MOSFET to provide compatibility with the external circuits. By controlling the relative phases of the spin wave signals, it is possible to realize different logic gates such as AND, OR, and NOT in one structure. A more detailed description of spin wave-based devices is given in [23].
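The phase-encoding scheme can be illustrated with an idealized superposition of unit-amplitude waves. The sketch below is not a micromagnetic model: it assumes all waves have the same frequency and amplitude, arrive at the output with no attenuation or extra propagation phase, and that the number of inputs is odd. The function names are hypothetical.

```python
import cmath
import math


def bit_to_phase(bit: int) -> float:
    """Phase 0 encodes logic 1 and phase pi encodes logic 0, as in the text."""
    return 0.0 if bit == 1 else math.pi


def interfere(bits: list[int]) -> complex:
    """Complex amplitude of the superposed unit-amplitude waves."""
    return sum(cmath.exp(1j * bit_to_phase(b)) for b in bits)


def majority_from_interference(bits: list[int]) -> int:
    """Read the output bit from the phase of the resultant wave."""
    resultant = interfere(bits)
    # A positive real part means the resultant phase is 0, i.e. logic 1.
    return 1 if resultant.real > 0 else 0


if __name__ == "__main__":
    for bits in ([1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]):
        resultant = interfere(bits)
        print(bits, "amplitude", round(abs(resultant), 3),
              "output", majority_from_interference(bits))
```

The output phase follows the majority of the inputs, while the output amplitude is larger for unanimous inputs than for a two-to-one split, which is one reason the circuit in Fig. 4 includes an amplifier stage.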

A first working spin-wave based logic circuit has been experimentally demonstrated by M. Kostylev et al. [26]. The prototype device was built on the basis of a Mach-Zehnder-type spin-wave interferometer, where the relative phases of two spin wave signals were controlled via the external magnetic field. The feasibility of a spin-wave based NOT gate has been demonstrated experimentally. Spin-wave logic exclusive-not-OR and not-AND gates based on the same structure have also been realized [50]. In our recent work [14], we presented another example of a working spin wave device, where the relative phases of the spin wave signals are controlled by the direction of the excitation current. Our experimental results have shown that spin-wave devices exploiting spin wave interference may be scaled to micrometer and nanometer scales.

#### 3.3 Architectures with spin wave buses: advantages and shortcomings

The implementation of spin wave-based devices will require special architecture solutions to benefit from the wave nature of the magnetic waves. The majority gate shown in Fig. 4 is an example of an efficient construction of a logic gate exploiting spin wave interference. A large number of spin waves of the same frequency can be combined in a waveguide structure. The waves coming in phase interfere in a constructive manner, and waves coming out of phase cancel each other. The phase of the output signal corresponds to the majority of signals coming in phase. In general, majority logic can implement a given digital function with a smaller number of logic gates than conventional Boolean CMOS logic [45]. For example, the full adder may be constructed with three majority gates and two inverters (3 magneto-electric cells and 2 modulators) [13]. In contrast, a Boolean-based implementation requires a larger circuit with seven or eight gate elements (about 25–30 MOSFETs) [60].
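The gate count quoted for the full adder can be checked against a well-known majority-logic construction, sketched below. This is the textbook decomposition with three majority gates and two inverters; whether it matches the exact cell arrangement of [13] is an assumption, and the function names are illustrative.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote on bits 0/1."""
    return int(a + b + c >= 2)


def invert(a: int) -> int:
    """Logical NOT (a pi phase shift in the spin wave picture)."""
    return 1 - a


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Full adder from three majority gates and two inverters.

    Returns (sum_bit, carry_out).
    """
    carry_out = majority(a, b, carry_in)            # majority gate 1
    partial = majority(a, b, invert(carry_in))      # gate 2, inverter 1
    sum_bit = majority(invert(carry_out), carry_in, partial)  # gate 3, inverter 2
    return sum_bit, carry_out


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, c)
        print(a, b, c, "->", "sum", s, "carry", cout)
```

Exhaustively checking all eight input combinations against ordinary addition confirms that the decomposition is correct.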

Another advantage to be used at the architecture level is the ability to transmit and process a number of signals in one structure at the same time. Spin waves of different frequencies can be excited and transmitted through the spin wave bus, where each frequency can be used as an information channel. The experimental data on the excitation and detection of spin waves of different frequencies in a nanometer-thick ferromagnetic film were presented in [13], and an example of the multi-bit processor comprising converters, modulators and magnetoelectric cells arranged along the spin wave bus is described in [23]. As pointed out by T. Roska [49], there are some computational algorithms, for example, those for image processing and speech recognition, that can be implemented more time-efficiently using waves rather than digital signals. As mentioned above, spin waves of different frequencies can be simultaneously excited, transmitted and modulated in the same structure, resulting in the possibility of multi-bit parallel processing. For example, the image processing function of labeling can be done efficiently, in O(log N) time for any given  $N \times N$  image using a spin wave architecture, as compared with O(N) for CMOS [40].

The defect tolerance of spin wave-based devices is determined by the wavelength of the spin wave. The signal in the spin wave bus is immune to any imperfection whose characteristic size is much smaller than the wavelength. The width and the thickness of the spin waveguides can be scaled down to several nanometers. However, there is a tradeoff between scalability and defect tolerance. To scale down the length of the logic gate, one needs to decrease the wavelength. At the same time, a shorter-wavelength signal becomes more sensitive to structural imperfections. A wavelength of 100 nm can be taken as a benchmark, while the optimum value has to be found by taking into consideration the particular material structure.

There are certain shortcomings associated with the use of spin waves: (i) relatively low group velocity ($\sim 10^{7}$ cm/s), and (ii) short decay time (about 1 ns) for propagating spin waves at room temperature. Spin wave dispersion depends on the waveguide geometry and the strength of the bias magnetic field, and varies for different spin wave modes. However, even in the best scenario, a spin wave signal is three orders of magnitude slower than photons in silica or an electromagnetic wave in a copper coaxial cable. These disadvantages may be partially compensated by short (submicron) propagation distances. The time delay per logic gate can be estimated to be in the range of 0.1–1.0 ns, and the maximum propagation length without amplification is limited to 5–10 $\mu$m at room temperature.
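The orders of magnitude quoted above can be checked with a short back-of-the-envelope calculation. The sketch assumes a group velocity of order $10^{7}$ cm/s (our reading of the figure in the text) and, as an additional assumption not taken from the paper, a refractive index of about 1.45 for silica. All names are illustrative.

```python
C_VACUUM_CM_PER_S = 2.998e10   # speed of light in vacuum
SILICA_INDEX = 1.45            # assumed refractive index of silica


def transit_time_ns(distance_um, group_velocity_cm_per_s=1e7):
    """Time for a spin wave to cover distance_um micrometers, in nanoseconds."""
    distance_cm = distance_um * 1e-4
    return distance_cm / group_velocity_cm_per_s * 1e9


def slowdown_vs_silica(group_velocity_cm_per_s=1e7):
    """Ratio of the speed of light in silica to the spin wave group velocity."""
    return (C_VACUUM_CM_PER_S / SILICA_INDEX) / group_velocity_cm_per_s
```

Under these assumptions, a 10 $\mu$m path takes about 0.1 ns of pure transit time, and the slowdown relative to light in silica is roughly $2 \times 10^{3}$, in line with the "three orders of magnitude" statement. The gate delay quoted in the text presumably also includes contributions beyond transit, which this estimate does not model.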

#### 3.4 Summary

In conclusion, the utilization of spin waves offers an original way of exploiting quantum-mechanical phenomena for information transmission and processing. The main advantage of the proposed approach lies in the ability to construct logic gates with fewer devices than CMOS requires. The disadvantages inherent to spin wave-based logic devices are low propagation speed and high attenuation. In spite of these disadvantages, magnetic logic circuits may provide a substantial throughput enhancement at the same or a lower level of power consumption in comparison to CMOS-based circuits. Potentially, magnetic circuits with spin wave buses may find applications as an interface between electron-based and spin-based logic circuits and as logic blocks for general-purpose computing and special information processing tasks. These points are summarized in Table 1.

Table 1: Comparing spin-wave with CMOS-based logic devices.

| Advantages                            | Disadvantages                |
|---------------------------------------|------------------------------|
| Multi-bit transmission and processing | Low signal propagation speed |
| Scalability                           | Fast spin wave damping       |
| Logic circuits with fewer components  |                              |
| Compatibility with CMOS technology    |                              |

Acknowledgement: The work was supported in part by the Focus Center Research Program (FCRP director: Dr. Betsy Weitzman) Center of Functional Engineered Nano Architectonics (FENA), and by the Nanoelectronics Research Initiative (NRI Director: Dr. Jeff Welser) - The Western Institute of Nanoelectronics (WIN).

### 4. REFERENCES

- [1] International Technology Roadmap for Semiconductors. 2007 Edition. 2007. Available online at http://www.itrs.net/Links/2007ITRS/Home2007.htm.

- [2] H. B. Akkerman *et al.* Towards molecular electronics with large-area molecular junctions. *Nature*, 441:69–72, Feb. 2007.
- [3] I. G. Baek *et al.* Multi-layer cross-point binary oxide resistive memory (OxRRAM) for post-NAND storage applications. In *Tech. Dig. IEDM'05*, pages 750–753, 2005.
- [4] J. Billen *et al.* Improved CuTCNQ resistive non-volatile memories and a statistical study on their threshold voltage. In *Proc. ICMTD'07*, pages 135–137, 2007.
- [5] D. Bratton *et al.* Recent progress in high resolution lithography. *Polymers for Adv. Technol.*, 17:94–103, Feb. 2006.
- [6] A. Chen *et al.* Non-volatile resistive switching for advanced memory applications. In *Tech. Dig. IEDM'05*, page 31.4, 2005.
- [7] M. Crocker, X. S. Hu, and M. Niemier. Fault models and yield analysis for QCA-based PLAs. *Int. Sym. on FPL*, pages 435–440, 2007.
- [8] G. Csaba, P. Lugli, A. Csurgay, and W. Porod. Simulation of power gain and dissipation in field-coupled nanomagnet. *J. of Comp. Elec.*, 4(1-2), 2005.
- [9] N. Dao, S. Whittenburg, and R. Cowburn. Micromagnetics simulation of deep-submicron supermalloy disks. *J. of Appl. Phys.*, 90(10):5235–7, 2001.
- [10] A. DeHon and K. K. Likharev. Hybrid CMOS / nanoelectronic digital circuits. In *Proc. ICCAD*'05, pages 375–382, 2005.
- [11] W. R. Dichtel *et al.* Designing bistable [2]rotaxanes for molecular electronic devices. *Phil. Trans. R. Soc. A*, 365:1607–1625, 2007.
- [12] M. Donahue and D. Porter. OOMMF user's guide, version 1.0, interagency report NISTIR 6367. http://math.nist.gov/oommf.
- [13] A. K. *et al.* Inductively coupled circuits with spin wave bus for information processing. *Journal of Nanoelectronics and Optoelectronics*, 3:24–34, 2008.
- [14] A. K. *et al.* Logic devices with spin wave buses: an approach to scalable magneto-electric circuitry. *Proceedings of the Material Research Society*, 2008.
- [15] S. Fölling, Ö. Türel, and K. K. Likharev. Single-electron latching switches as nanoscale synapses. In *Proc. IJCNN'01*, pages 216–221, 2001.
- [16] C. J. Gao and D. Hammerstrom. Cortical models onto CMOL and CMOS - Architectures and performance/price. *IEEE Trans. on Circ. Syst.*, 54:2502–2515, Nov. 2007.
- [17] J. E. Green *et al.* A 160-kilobit molecular electronic memory patterned at 10<sup>11</sup> bits per square centimetre. *Nature*, 445:414–417, Jan. 2007.
- [18] I. W. Hamley. Nanostructure fabrication using block copolymers. *Nanotechnology*, 14:R39–R54, Oct. 2003.
- [19] J. R. Heath *et al.* A defect-tolerant computer architecture: Opportunities for nanotechnology. *Science*, 280:1716–1721, Jun. 1998.
- [20] A. Imre, G. Csaba, L. Ji, A. Orlov, G. Bernstein, and W. Porod. Majority logic gate for Magnetic Quantum-dot Cellular Automata. *Science*, 311(5758):205–208, Jan. 2006.
- [21] S. H. Jo and W. Lu. CMOS compatible nanoscale nonvolatile switching memory. *Nano Lett.*, 8:392–397, Jan. 2008.
- [22] G.-Y. Jung *et al.* Circuit fabrication at 17 nm half-pitch by nanoimprint lithography. *Nano Lett.*, 6:351–354, Mar. 2006.
- [23] A. Khitun, M. Bao, and K. Wang. Spin wave magnetic nano-fabric: a new approach to spin-based logic circuitry. *IEEE Transactions on Magnetics*, 2008.
- [24] A. Khitun and K. Wang. Nano scale computational architectures with spin wave bus. *Superlattices & Microstructures*, 38:184–200, 2005.

- [25] A. Khitun and K. L. Wang. Nano logic circuits with spin wave bus. *International Conference on Information Technology: New Generation*, page 6, 2006.
- [26] M. P. Kostylev, A. A. Serga, T. Schneider, B. Leven, and B. Hillebrands. Spin-wave logical gates. *Applied Physics Letters*, 87:153501–1–3, 2005.
- [27] M. N. Kozicki. Nanoscale memory elements based on solid-state electrolytes. *IEEE Trans. on Nanotechnology*, 4:331–338, May 2005.
- [28] J. H. Lee and K. K. Likharev. Crossnets as pattern classifiers. *Lecture Notes in Computer Science*, 3512:446–454, 2005.
- [29] J. H. Lee and K. K. Likharev. In situ training of CMOL CrossNets. In *Proc. WCCI/IJCNN'06*, pages 5026–5024, 2006.
- [30] J. H. Lee and K. K. Likharev. Defect-tolerant nanoelectronic pattern classifiers. *Int. J. Circ. Theory App.*, 35:239–4, 2007.
- [31] J. H. Lee, X. Ma, and K. K. Likharev. CMOL Crossnets: Possible neuromorphic nanoelectronic circuits. In Y. Weiss *et al.*, editor, *Advances in Neural Information Processing Systems*, volume 18, pages 755–762. MIT Press, 2006.
- [32] C. Lent and P. Tougaw. A device architecture for computing with quantum dots. *Proc. of the IEEE*, 85:541, 1997.
- [33] K. K. Likharev. Electronics below 10 nm. In J. Greer *et al.*, editor, *Nano and Giga Challenges in Microelectronics*, pages 27–68. Elsevier, Amsterdam, 2003.
- [34] K. K. Likharev. CMOL: A silicon-based bottom-up approach to nanoelectronics. *Interface*, 14:43–45, May 2005.
- [35] K. K. Likharev. CMOL: Freeing advanced lithography from the alignment accuracy burden. *J. Vac. Sci. Technol. B*, 25:2531–2536, Nov. 2007.
- [36] K. K. Likharev. Resistive and hybrid CMOS/nanodevice memories. In J. E. Brewer and M. Gill, editors, *Nonvolatile Memory Technologies with Emphasis on Flash*, pages 696–703. IEEE Press, Hoboken, NJ, 2008.
- [37] K. K. Likharev and D. B. Strukov. CMOL: Devices, circuits, and architectures. In G. Cuniberti *et al.*, editor, *Introducing Molecular Electronics*, pages 447–477. Springer, Berlin, 2005.
- [38] K. K. Likharev and D. B. Strukov. Prospects for the development of digital CMOL circuits. In *Proc. NanoArch'07*, pages 109–116, 2007.
- [39] K. K. Likharev *et al.* CrossNets: High-performance neuromorphic architectures for CMOL circuits. *Ann. NY Acad. Sci.*, 1006:15–58, 2003.
- [40] A. K. M. M. Eshaghian-Wilner, S. Navab and K. L. Wang. Constant-time image processing on spin-wave nano-architectures. *Physica Status Solidi*, 2007.
- [41] X. Ma and K. K. Likharev. Global reinforcement learning in stochastic neural networks. *IEEE Trans. on Neural Networks*, 18:573–577, Mar. 2007.
- [42] M. Masoumi *et al.* Design and evaluation of basic standard encryption algorithm modules using nanosized complementary metal-oxide-semiconductor-molecular circuits. *Nanotechnology*, 17:89–99, Jan. 2005.
- [43] A. Mayr *et al.* Synthesis of oligo(phenyleneethynylene)s containing central pyromellitdiimide or naphthalenediimide groups and bearing terminal isocyanide groups: molecular components for single-electron transistors. *Tetrahedron*, 63:8206–8217, Aug. 2007.
- [44] G. I. Meijer. Who wins the nonvolatile memory race? *Science*, 319:1625–1626, Mar. 2008.
- [45] A. R. Meo. Majority gate networks. *IEEE Transactions on Electronic Computers*, EC-15:606–18, 1966.

- [46] M. Niemier, M. Alam, X. S. Hu, G. Bernstein, W. Porod, M. Putney, and J. DeAngelis. Clocking structures and power analysis for nanomagnet-based logic devices. *ISLPED*, pages 26–31, 2007.
- [47] M. Niemier, M. Crocker, and X. Hu. Fabrication variations and defect tolerance for nanomagnet-based QCA. *IEEE Int. Symp. on Defect and Fault Tolerance in VLSI Sys.*, Oct. 1-3, 2008.
- [48] G. A. Prinz. Magnetoelectronics. *Science*, 282:1660–63, 1998.
- [49] T. Roska. Analogic wave computers-wave-type algorithms: canonical description, computer classes, and computational complexity. *IEEE International Symposium on Circuits and Systems*, 2:41–4, 2001.
- [50] T. Schneider, A. A. Serga, B. Leven, B. Hillebrands, R. L. Stamps, and M. P. Kostylev. Realization of spin-wave logic gates. *Appl. Phys. Lett.*, 92:022505–3, 2008.
- [51] G. S. Snider and R. S. Williams. Nano/CMOS architectures using a field-programmable nanowire interconnect. *Nanotechnology*, 18, Jan. 2007. art. 035204.
- [52] H. H. Solak. Nanolithography with coherent extreme ultraviolet light. *J. Phys. D*, 39:R171–R178, May 2006.
- [53] M. R. Stan *et al.* Molecular electronics. *Proc. IEEE*, 91:1940–1957, Nov. 2003.
- [54] D. B. Strukov and K. K. Likharev. CMOL FPGA: A cell-based, reconfigurable architecture for hybrid digital circuits using two-terminal nanodevices. *Nanotechnology*, 16:888–900, Jun. 2005.
- [55] D. B. Strukov and K. K. Likharev. Prospects for terabit-scale nanoelectronic memories. *Nanotechnology*, 16:137–148, Jan. 2005.
- [56] D. B. Strukov and K. K. Likharev. CMOL FPGA circuits. In *Proc. CDES'06*, pages 213–219, 2006.
- [57] D. B. Strukov and K. K. Likharev. Defect-tolerant architectures for nanoelectronic crossbar memories. *J. Nanoscience and Nanotechnology*, 7:151–167, Jan. 2007.
- [58] D. B. Strukov and K. K. Likharev. Reconfigurable hybrid CMOS/nanodevice circuits for image processing. *IEEE Trans. on Nanotechnology*, 6:696–710, Nov. 2007.
- [59] D. B. Strukov and K. K. Likharev. A reconfigurable architecture for hybrid CMOS/nanodevice circuits. In *Proc. FPGA'06*, pages 131–140, 2006.
- [60] T. F. T. Oya, T. Asai and Y. Amemiya. A majority-logic device using an irreversible single-electron box. *IEEE Transactions on Nanotechnology*, 2:15–22, 2003.
- [61] D. Tu *et al.* Three-dimensional CMOL: Three-dimensional integration of CMOS/nanomaterial hybrid digital circuits. *Micro Nano Lett.*, 2:40–45, Jun. 2007.
- [62] Ö. Türel *et al.* Neuromorphic architectures for nanoelectronic circuits. *Int. J. Circ. Theory App.*, 32:277–302, Sep.-Oct. 2004.
- [63] D. J. Wagner and A. H. Jayatissa. Nanoimprint lithography: Review of aspects and applications. *Proc. SPIE*, 6002:136–144, Nov. 2005.
- [64] K. L. Wang, A. Khitun, and A. H. Flood. Interconnects for nanoelectronics. *IEEE 2005 International Interconnect Technology Conference*, pages 231–233, 2005.
- [65] R. Waser and M. Aono. Nanoionics-based resistive switching memories. *Nature Materials*, 6:833–840, Nov. 2007.
- [66] X. Wu, C. Liu, L. Li, P. Jones, R. Chantrell, and D. Weller. Nonmagnetic shell in surfactant-coated FePt nanoparticles. *J. Appl. Phys.*, 95:6810–6812, 2004.