# **Design Methodology of Low-Power Microprocessors**

## **Toshihiro Hattori**

Hitachi, Ltd. hattori-toshihiro@sic.hitachi.co.jp

Abstract - Low power is one of the most important targets of embedded microprocessor design. Several design technologies are applied in real microprocessors to reduce dynamic and static power. However, the term "low power" is not simple in real applications: depending on the low-power requirements of the system, many varieties of low-power features are required of the microprocessor. As an example, an application processor for cellular phones designed with many low-power technologies is described. Low-power technologies at both the device / circuit level and the architecture / logic level are discussed.

#### I. Introduction

Low power is one of the most important targets of embedded microprocessor design. However, there are many definitions of the power consumption of a microprocessor. The system that integrates the microprocessor defines the many scenarios in which low power matters. Chapter II describes some definitions of low power. Chapter III describes an example of a low-power microprocessor, designed as an application processor for cellular phones. Chapter IV describes the architecture / software level low-power solutions in the application processor. Chapter V lists the key low-power technologies at the device / circuit level, and Chapter VI lists those at the architecture / logic level.

#### II. Power Consumption of Microprocessor

Power consumption of microprocessors can be defined in many ways. There are various metrics of power consumption in real applications of microprocessors. This chapter lists some metrics of power consumption for an application processor in a cellular phone.

### A. Average Power

The most popular definition is the average power consumption while a typical application program is running on the microprocessor. The "mA/MHz" metric is used in discussions of average power consumption. This metric may correspond to battery life in portable equipment.

### B. Maximum Power Consumption

The average power consumption changes drastically depending on the application software that the processor is running. The toggle ratio of data values in a computing program may change the average power consumption by more than 30%. Handset designers want to know the maximum power consumption of the microprocessor when they choose the voltage regulator or design the temperature distribution inside the handset.

#### C. Standby Power Consumption

Current microprocessors have standby mode features. After a program issues a standby instruction, the processor stops its internal clock distribution. In standby mode, the power consumption of the microprocessor is due only to leakage current. However, leakage current has increased as the threshold voltage (Vth) of deep sub-micron processes has been lowered. Many technologies to reduce leakage current are applied in low-power microprocessors, because standby current is one of the most important values in portable equipment. For example, a cellular phone needs high CPU performance during communication, but most of the time the phone is only waiting for calls.
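The weight of standby current in a mostly idle handset can be illustrated with a time-weighted average. The following is a minimal sketch: the active current, battery capacity, and active fraction are hypothetical values chosen only for illustration, not figures from this paper.

```python
def average_current_ma(active_ma, standby_ma, active_fraction):
    """Time-weighted average current of a device that alternates
    between an active state and a standby state."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be in [0, 1]")
    return active_fraction * active_ma + (1.0 - active_fraction) * standby_ma


def battery_life_hours(capacity_mah, avg_ma):
    """Idealized battery life: capacity divided by average current."""
    return capacity_mah / avg_ma


# Hypothetical values (assumptions, not taken from the paper).
ACTIVE_MA = 100.0        # current while the processor is busy
CAPACITY_MAH = 600.0     # battery capacity
ACTIVE_FRACTION = 0.02   # phone is busy 2% of the time

for standby_ma in (1.0, 0.01):  # mA-order leakage vs. 10 uA standby
    avg = average_current_ma(ACTIVE_MA, standby_ma, ACTIVE_FRACTION)
    hours = battery_life_hours(CAPACITY_MAH, avg)
    print(f"standby {standby_ma} mA: average {avg:.3f} mA, {hours:.0f} h")
```

With these assumed numbers, a standby current on the order of mA accounts for roughly a third of the average current, even though the processor is almost always idle.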

### D. Power Consumption in power off mode

As mentioned above, standby leakage current has increased to the order of mA, because a low supply voltage requires a low Vth in deep sub-micron processes. Because portable equipment cannot tolerate leakage current on the order of mA, another option is to cut the power supply to the microprocessor. Of course, the power consumption of a device that is not supplied with power is zero. In a real system, however, even a powered-off microprocessor must hold its external pins at stable values to prevent leakage current at the board level. In addition, some functions in the microprocessor, such as a real-time clock, must remain active even when the main power supply of the microprocessor is cut. The realistic power consumption in power-off mode is therefore defined according to the system requirements.

#### E. Memory Retention Power Consumption

Because power-off mode requires the full initial boot procedure when the microprocessor wakes up, some microprocessors support a memory-retention power-off mode. In this mode, the contents of the on-chip memory are kept. Of course, the power supply for the memory macros must be maintained even when the main power supply for the random logic in the microprocessor is shut down.

### F. Sleep Mode Power Consumption

Depending on the requirements of system designers, microprocessor chips should offer a variety of operation modes. For example, the CPU clock may be stopped while the interrupt controller stays active, waiting for an interrupt signal from outside. Or one peripheral IP may be powered off while the CPU is running. The definition of power consumption is therefore determined by the requirements of the system.

#### G. Multi-target low power

As described above, many kinds of metrics can be defined, and several of them may be important for the system requirements at once. "Low power" is therefore a multi-target objective whose targets should be achieved simultaneously.



Figure 1 Chip Micrograph of the Application Processor

#### III. Low-power Microprocessor

Fig. 1 is a photo of a microprocessor designed as an application processor for cellular phones. This application processor is optimized for application processing in 3G cellular phones. The processor consists of an SH3-DSP processor core [1,5], which integrates the SH3-CPU and DSP core seamlessly as described in [2], together with a large on-chip SRAM and rich peripherals. The processor enables software-based MPEG-4 encoding/decoding and execution of Java applications by using the CPU, DSP, and on-chip SRAM effectively. The chip is fabricated in a 5-metal-layer 0.18-$\mu$m CMOS technology. The supply voltage is 1.5V and the chip runs at a maximum of 133-MHz. The Dhrystone 1.1 performance is 178-MIPS and the peak DSP performance is 133-MMACS (1-MMACS/MHz). The chip area is 6.34 × 6.79-mm<sup>2</sup>. The chip is packaged in a 240-pin CSP, and typically consumes 170-mW running at 133-MHz and at most 10-$\mu$A in stand-by mode.
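The figures quoted above can be converted into the "mA/MHz" style of metric discussed in Chapter II. The sketch below uses only the numbers stated in this section; it assumes, for simplicity, that the entire 170-mW typical power is drawn from the 1.5V supply, which the paper does not state.

```python
# Values quoted in the text for the application processor.
SUPPLY_V = 1.5          # supply voltage [V]
FREQ_MHZ = 133.0        # maximum operating frequency [MHz]
TYPICAL_MW = 170.0      # typical power at 133 MHz [mW]
DHRYSTONE_MIPS = 178.0  # Dhrystone 1.1 performance [MIPS]


def supply_current_ma(power_mw, supply_v):
    """Current drawn from a single supply rail: I = P / V.
    Assumption: all power comes from this one rail."""
    return power_mw / supply_v


current_ma = supply_current_ma(TYPICAL_MW, SUPPLY_V)
ma_per_mhz = current_ma / FREQ_MHZ
mips_per_mhz = DHRYSTONE_MIPS / FREQ_MHZ
mips_per_mw = DHRYSTONE_MIPS / TYPICAL_MW

print(f"supply current : {current_ma:.1f} mA")
print(f"mA/MHz         : {ma_per_mhz:.3f}")
print(f"MIPS/MHz       : {mips_per_mhz:.2f}")
print(f"MIPS/mW        : {mips_per_mw:.2f}")
```

Under that single-rail assumption, the typical operating point works out to roughly 0.85 mA/MHz and a little over 1 MIPS/mW.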



Figure 2 Application Processor Block Diagram

Figure 2 shows a block diagram of the application processor in a 3G cellular phone. This processor integrates an SH3-DSP core, a 32-kB 4-way set-associative unified cache with the least-recently-used (LRU) replacement strategy, a 16-kB XYRAM for the DSP feature, a 128-kB user RAM (URAM), a TLB, a DMA controller (DMAC), a bus state controller (BSC), a 1-kB shared SRAM (MFRAM), and other peripheral interfaces. The SH3-DSP core is designed with a single-phase clocking methodology and adopts clock gating to lower power consumption. The single-cycle-access, large-capacity URAM, mapped to a specific address space, improves performance for multimedia applications and reduces the power consumption of the I/O part of the processor by reducing external memory accesses. The communication protocol between the base-band processor and the application processor is based on a simple SRAM interface, where the MFRAM is shared by the two processors under the control of the multi-functional interface (MFI).
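For readers unfamiliar with the cache organization named above, the following is a minimal behavioral sketch of a 4-way set-associative cache with LRU replacement. The 32-kB size and 4 ways come from the text; the 16-byte line size is an assumption, since the paper does not give it, and the class and method names are hypothetical. The model tracks hits and misses only and is not a description of the actual hardware.

```python
from collections import OrderedDict


class SetAssociativeLRUCache:
    """Hit/miss model of a set-associative cache with LRU replacement."""

    def __init__(self, size_bytes=32 * 1024, ways=4, line_bytes=16):
        # line_bytes=16 is an assumed value, not taken from the paper.
        self.ways = ways
        self.line_bytes = line_bytes
        self.num_sets = size_bytes // (ways * line_bytes)
        # One ordered dict per set: keys are tags, oldest entry first.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def access(self, address):
        """Access one byte address; return True on hit, False on miss."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)     # mark as most recently used
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)  # evict least recently used
        entries[tag] = None
        return False


cache = SetAssociativeLRUCache()
stride = cache.num_sets * cache.line_bytes  # addresses that share set 0
for i in range(4):
    cache.access(i * stride)                # fill all four ways (misses)
print(cache.access(0))                      # True: still resident
print(cache.access(4 * stride))             # False: evicts the LRU line
print(cache.access(1 * stride))             # False: that line was evicted
```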



#### IV. Low Power Solution in the Application Processor

Figure 3 shows some examples of the SH3-DSP core instructions. The core has the dual-mode instruction set architecture of 16-bit CPU instructions for high code density and 32-bit DSP instructions for high performance. The CPU and DSP instructions can be mixed in an instruction sequence without any mode change. In DSP instructions, the uppermost six bits indicate a 32-bit code, and the lower bits are a load/store field and a calculation field. Four operations, that is, one ALU operation, one multiplication, and two load/store operations for XYRAM, can be executed in parallel.

Video-attached wireless communication based on the MPEG-4 Codec is one of the promising applications of 3G cellular phones. However, from the point of view of accelerating the execution of other multimedia applications such as Java, dedicated MPEG-4 Codec LSIs [3,4] are not suitable. This processor executes the MPEG-4 Codec (QCIF Simple@L1) in software, which allows it to execute other applications as well. The optimized software MPEG-4 encoder assigns motion estimation, DCT, IDCT, and motion compensation processing to the DSP. The 37-kB reference image and the 37-kB reconstructed image of one video object plane are allocated to the URAM together with part of the encoder software, while the predicted coefficients, the reconstructed coefficients, and the reconstructed macroblock are allocated to the XYRAM. Note that only a small portion of the XYRAM, 4.8-kB of the total 16-kB, is occupied. Figure 4 shows the performance improvement in MPEG-4 encoding obtained by using the DSP, XYRAM, and URAM, which is due to two major factors.

- (i) Speedup due to DSP employment: The execution time in (b) is 45% shorter than in (a), since the DSP exploits parallelism within the major loops.
- (ii) Speedup due to URAM utilization: The execution time in (c) is 70% shorter than in (a), because the external memory access rate decreases.

Thus, this processor reduces the frequency required for 15-frames/sec MPEG-4 encoding to as low as 70-MHz, leaving sufficient headroom of 63-MHz for other software tasks. The measured processor power is 140-mW even with software encoding.
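The memory and cycle budgets quoted in this chapter can be cross-checked with a few lines of arithmetic. The sketch below assumes a QCIF frame of 176 × 144 pixels in 4:2:0 format with 8 bits per sample, which is the usual layout for this profile but is not stated explicitly in the paper. The frame rate and frequencies are taken from the text.

```python
QCIF_WIDTH, QCIF_HEIGHT = 176, 144   # QCIF resolution in pixels
MAX_MHZ = 133.0                      # maximum clock frequency (from text)
ENCODE_MHZ = 70.0                    # frequency needed for encoding (from text)
FRAMES_PER_SEC = 15.0                # encoding frame rate (from text)


def yuv420_frame_bytes(width, height):
    """Bytes in one 4:2:0 frame at 8 bits per sample (assumed format):
    a full-resolution luma plane plus two quarter-resolution chroma planes."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma


frame_bytes = yuv420_frame_bytes(QCIF_WIDTH, QCIF_HEIGHT)
frame_kb = frame_bytes / 1024

headroom_mhz = MAX_MHZ - ENCODE_MHZ
cycles_per_frame = ENCODE_MHZ * 1e6 / FRAMES_PER_SEC
macroblocks = (QCIF_WIDTH // 16) * (QCIF_HEIGHT // 16)
cycles_per_macroblock = cycles_per_frame / macroblocks

print(f"frame size         : {frame_bytes} bytes ({frame_kb:.1f} kB)")
print(f"headroom           : {headroom_mhz:.0f} MHz")
print(f"cycles / frame     : {cycles_per_frame:,.0f}")
print(f"cycles / macroblock: {cycles_per_macroblock:,.0f}")
```

Under the assumed format, one frame occupies 38,016 bytes, which agrees with the 37-kB image buffers, so the reference and reconstructed images together fit comfortably in the 128-kB URAM.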

### V. Device / Circuits Level Low-Power Technologies

Several device / circuit level technologies are available for low power. Technologies A-D listed below reduce dynamic power; E and F reduce leakage power.





#### A. Standard Transistor Size Optimization

Both high performance and low power are required in current microprocessor design. However, most methods for achieving higher frequency increase power consumption. Larger transistors used to be preferred in processor design. But when both frequency and power are taken into account, the optimal average transistor size lies at a rather small transistor size.

### B. Reduce Junction Capacitance

Junction capacitance, not only wiring capacitance, is important for power reduction. Cell circuit and layout techniques that reduce it can reduce power.

### C. Repeater & Buffer Trees

Instead of using transistors with large drive strength, multi-stage repeaters for long-distance signal transfer can sometimes reduce power.

#### D. Low Voltage Operation

Low-voltage operation is the key to low power, because dynamic power is roughly proportional to the square of the supply voltage. Circuit techniques that allow low-voltage operation are required.
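The square-law dependence follows from the usual first-order model of CMOS switching power, written below. The symbols $\alpha$ (switching activity), $C$ (switched capacitance), $V_{DD}$ (supply voltage), and $f$ (clock frequency) are standard textbook notation rather than notation from this paper, and the model ignores short-circuit and leakage power.

$$
P_{dyn} \approx \alpha \, C \, V_{DD}^{2} \, f
$$

For example, with everything else held fixed, lowering the supply from 1.5V to 1.2V scales dynamic power by $(1.2/1.5)^{2} = 0.64$.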

### E. Control of Substrate Voltage

Substrate voltage control can reduce not only standby leakage current but also leakage current during real operation.

#### F. Power Switch

As mentioned above, the final solution for leakage current reduction is to turn the power off. On-chip power switch technology is therefore important.

### VI. Architecture / Logic Level Low-power technologies

There are also several low-power technologies at the architecture / logic design level. All of the technologies listed below reduce dynamic power; E also reduces leakage power.

### A. High-performance in low-frequency

Dynamic power is proportional to operating frequency. If the same application can be executed at a lower frequency, dynamic power is reduced. Architectural improvements such as superscalar execution and wider bit widths are one way to reduce dynamic power.

### B. Hardware Accelerator and DSP

For specific applications like MPEG-4, hardwired logic can achieve a solution at low frequency. A DSP unit is well suited to multimedia applications. Introducing such function IPs can reduce power consumption.

### C. Instruction Set Architecture

Instruction sets can be designed for high frequency and for low power. Reducing the switching ratio is the target of this design.

### D. Gated Clock

Clock distribution and flip-flop (FF) switching are among the major contributors to power consumption in microprocessors. A well-designed gated clock can reduce the switching ratio drastically.
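The effect of clock gating on clock activity can be illustrated with a simple count of the clock edges delivered to a register bank. This is a toy model under stated assumptions: the enable trace and the 32-bit register width are hypothetical, and real savings also depend on the clock tree and the gating cells themselves.

```python
def clock_edges_delivered(enable_trace, num_flip_flops, gated):
    """Count rising clock edges that reach the flip-flops of one register.

    enable_trace  : one boolean per cycle, True when the register loads
    num_flip_flops: number of flip-flops sharing the (gated) clock
    gated         : if True, the clock is delivered only in enabled cycles
    """
    if gated:
        active_cycles = sum(1 for enabled in enable_trace if enabled)
    else:
        active_cycles = len(enable_trace)
    return active_cycles * num_flip_flops


# Hypothetical example: a 32-bit register that loads once every 4 cycles.
trace = [True, False, False, False] * 2
ungated = clock_edges_delivered(trace, 32, gated=False)
gated = clock_edges_delivered(trace, 32, gated=True)
saving = 1.0 - gated / ungated

print(f"ungated: {ungated} edges, gated: {gated} edges, saving {saving:.0%}")
```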

### E. Reduce Memory Access

Memory macros are another major contributor to power consumption in microprocessors. Dividing memory into multiple small memory mats and eliminating unnecessary memory activation can reduce the power.

### F. On Chip Memory, On Chip Integration

Current microprocessors operate at low voltages such as 1.2V or 1.0V. However, the external interfaces of the chip still use 3.3V or 2.5V for board operation. Reducing I/O power is therefore one of the key ideas. The simplest method is to integrate components inside the chip, which removes I/O pins.

### G. Optimization of Sleep Mode

As mentioned in Chapter II, the sleep / standby / power-off modes can be designed according to the system scenario. Optimizing the sleep modes reduces the power that the user really cares about.

### H. Dynamic Control of Supply Voltage

A low supply voltage is the key to low power because power consumption is proportional to the square of the supply voltage. Dynamically controlling the supply voltage and operating frequency as the demand for CPU performance falls can reduce power drastically.
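The benefit of scaling voltage together with frequency can be sketched with the first-order relation that dynamic power is proportional to frequency times the square of the supply voltage. The operating points below are hypothetical: the paper does not state which voltage a reduced frequency would permit, so the 1.2V point is an assumption for illustration only.

```python
def relative_dynamic_power(freq_ratio, voltage_ratio):
    """Dynamic power relative to a reference operating point,
    using the first-order model P ~ f * V^2."""
    return freq_ratio * voltage_ratio ** 2


# Reference point from the text: 133 MHz at 1.5 V.
REF_MHZ, REF_V = 133.0, 1.5
# Hypothetical low-demand point: half the frequency at 1.2 V (assumed).
LOW_MHZ, LOW_V = 66.5, 1.2

freq_only = relative_dynamic_power(LOW_MHZ / REF_MHZ, 1.0)
freq_and_voltage = relative_dynamic_power(LOW_MHZ / REF_MHZ, LOW_V / REF_V)

print(f"frequency scaling only      : {freq_only:.2f} x reference power")
print(f"frequency + voltage scaling : {freq_and_voltage:.2f} x reference power")
```

Halving the frequency alone halves the power but leaves the energy per cycle unchanged. Lowering the voltage as well reduces the energy per cycle, which is why the combined control is more effective.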

### I. Logic Partitioning for Power Switching and Clock Stop

As mentioned above, stopping the clock of a module and powering off a module can reduce power. The logic hierarchy that allows power-off module by module should be designed carefully, because some key functions in a module may be needed from the viewpoint of the system that uses the microprocessor.

### VII. Conclusions

Low-power technologies for microprocessor design have been discussed. The most important fact is that the purpose of a low-power microprocessor is a low-power system. If a cellular phone system can use only a series regulator to generate the microprocessor supply, the power consumption is not proportional to the square of the supply voltage, because the series regulator dissipates power in order to produce the lower voltage. Therefore, defining what low power means is one of the most important parts of low-power methodology.
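The series regulator argument can be made concrete with a small calculation. A series (linear) regulator passes the load current straight through from the battery, so the battery supplies the core current at the full battery voltage. The sketch below assumes a 3.6V battery and an ideal regulator with no quiescent current. Both are assumptions for illustration. The reference point of 170-mW at 1.5V is taken from Chapter III.

```python
def core_power_mw(v_core, v_ref, p_ref_mw):
    """Core dynamic power at v_core, scaled from a reference point
    with the first-order model P ~ V^2 (frequency held fixed)."""
    return p_ref_mw * (v_core / v_ref) ** 2


def battery_power_series_regulator_mw(v_battery, v_core, p_core_mw):
    """Battery power with an ideal series regulator: the battery supplies
    the core current at the battery voltage, and the regulator dissipates
    the difference as heat."""
    core_current_ma = p_core_mw / v_core
    return v_battery * core_current_ma


V_BATTERY = 3.6                 # assumed battery voltage [V]
V_REF, P_REF_MW = 1.5, 170.0    # reference operating point from the text

for v_core in (1.5, 1.2):
    p_core = core_power_mw(v_core, V_REF, P_REF_MW)
    p_batt = battery_power_series_regulator_mw(V_BATTERY, v_core, p_core)
    print(f"Vcore {v_core} V: core {p_core:.1f} mW, battery {p_batt:.1f} mW")
```

Under these assumptions, lowering the core voltage from 1.5V to 1.2V cuts core power to 0.64 of its reference value, but battery power only to 0.80 of its reference value. Seen from the battery, the saving is linear in the supply voltage rather than quadratic.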

### References

[1] T.Yamada et al., "A Low-Power Embedded RISC Microprocessor with an Integrated DSP for Mobile Applications," to be published in IEICE Transactions on Electronics.

[2] A.Shridhar et al., "Integrating a DSP with a Microcontroller," Proceedings of the Signal Processing Applications Conference at DSPx '96, pp.645-652, March 1996.

[3] T.Hashimoto et al., "A 90mW MPEG4 Video Codec LSI with the Capability for Core Profile," ISSCC Digest of Technical Papers, pp.140-141, Feb. 2001.

[4] T.Nishikawa et al., "A 60MHz 240mW MPEG-4 Video-Phone LSI with 16Mb Embedded DRAM," ISSCC Digest of Technical Papers, pp.230-231, Feb. 2000.

[5] T.Tsunoda et al., "Application Processor for 3G cellular Phones," Proc. COOL Chips V, vol. I, pp.102-111, Apr. 2002.