# A High Level SoC Power Estimation Based on IP Modeling

David Elléouet<sup>1</sup>, Nathalie Julien<sup>2</sup>, Dominique Houzet<sup>1</sup>

<sup>1</sup>Laboratoire I.E.T.R UMR CNRS 6164 Institut National des Sciences Appliquées 35043 RENNES Cédex France david.elleouet@ens.insa-rennes.fr

<sup>2</sup>Laboratoire L.E.S.T.E.R FRE CNRS 2734 Université de Bretagne Sud 56321 Lorient Cédex France nathalie.julien@univ-ubs.fr

### Abstract

Power consumption has become a primary concern in electronic system design. However, in many design tools, the power budget of an application is estimated only after RTL synthesis. In this article, we propose a measurement-based methodology that models the power consumption of an application as a function of architectural and algorithmic parameters. The modeled applications can then be added to a library to help the system designer determine, early in the design flow, the best trade-off between high performance and low power consumption.

## **1 INTRODUCTION**

Electronic systems are becoming ever more complex, fast, powerful and power-hungry. Indeed, transistor miniaturization dramatically increases the power consumed by a whole chip [5]. The main consequences of this trend are the addition of cooling circuits and a reduced battery lifetime for embedded systems. As with timing and die area, power consumption has become a critical constraint for electronic system designers. As shown in Figure 1, the system power estimate is usually obtained after place and route (2). Design optimizations at this level are time-consuming and not always obvious. Moreover, this estimate cannot be reused to design a new system. As demonstrated in [4], improving the effectiveness of the design flow requires raising the abstraction level of the power estimator. Therefore, we propose in this paper to model the power consumption of existing IPs with high-level parameters (1). Each IP and its associated power model are placed in a CAD tool library. Thus, the IPs can be characterized at an early stage of the design flow in order to meet the system power constraint. The designer's efficiency is enhanced, and each model can be reused with its IP to design a new system.

The paper is organized as follows. Section 2 presents the methodology used to model IP power consumption and illustrates it with two case studies: the LAR space coder and the Fast Fourier Transform. Section 3 shows how to use the IP models to estimate the system power consumption. Section 4 concludes the paper and presents future work.



Figure 1. System design flow with high-level power consumption consideration.

## 2 HOW TO MODEL AN IP POWER CONSUMPTION

#### 2.1 Power Characterization Methodology

The FLPA (Functional Level Power Analysis) methodology was applied to model IP power consumption on FPGA. This methodology was developed in [3] to extract a processor power consumption model from a set of high-level parameters [2]. FLPA is based on physical measurements, which guarantee realistic values with good accuracy. As shown in Figure 2, the methodology has four main steps:

- The IPs are considered as black boxes. A preliminary functional analysis determines which high-level parameters have an effective impact on power consumption. The relevant algorithmic and architectural parameters are then selected, such as image entropy, number of FFT points, filter order, clock frequency and data size.
- Then, the power characterization step makes explicit the power consumption behavior (obtained by measurements) when each parameter varies independently.
- After curve fitting, the complete power model is obtained; it expresses, through mathematical laws, how the power consumption varies with all the parameters.
- Finally, the accuracy of the resulting model is validated against a new set of measurements.
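The curve-fitting step can be sketched as follows. This is only a minimal illustration, assuming a linear law and an ordinary least-squares fit; the paper does not specify the fitting procedure, and the helper `fit_line` and the sweep values are hypothetical, synthetic stand-ins for real measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical characterization sweep: power (mW) recorded while only the
# clock frequency (MHz) varies. These values are synthetic, not measurements.
frequencies_mhz = [10.0, 20.0, 30.0, 40.0, 50.0]
power_mw = [2.0 * f + 25.0 for f in frequencies_mhz]

slope, intercept = fit_line(frequencies_mhz, power_mw)
print(f"P(f) ~ {slope:.3f} * f + {intercept:.3f} mW")
```

Repeating such a fit for each parameter in turn, while the others are held constant, gives the individual laws that are then combined into the complete model.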

#### 2.2 Power Modeling Methodology

The IP power model is given by equation (1). It is composed of two terms, which represent the dynamic power (equation (3)) and the static power (equation (2)).

$$P_{IPModel} = P_{Dynamic} + P_{Static} \tag{1}$$

Equation (2) is obtained by statistical analysis of power consumption measurements taken at a clock frequency of 0 MHz. It is a function of the high-level parameters and of the power consumed by the FPGA configuration plan. The latter depends on the FPGA size and technology.

$$P_{Static} = S(HighLevelParameters) + P_{FPGAPlan} \tag{2}$$



Figure 2. Power characterization and modeling methodology framework.

The dynamic power is deduced from the remaining power consumption measurements, which yields equation (3) as a function of the clock frequency and the high-level parameters.

$$P_{Dynamic} = D(HighLevelParameters) \times Frequency \tag{3}$$
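The decomposition of equations (1) to (3) can be illustrated with a short sketch. It assumes the simplest possible case, with one measurement at 0 MHz and one at a non-zero frequency for a fixed set of high-level parameters; the function names and the numeric values are hypothetical and do not come from the paper's measurements.

```python
def split_static_dynamic(p_at_0mhz_mw, p_at_f_mw, f_mhz):
    """Return (static power in mW, dynamic coefficient in mW/MHz).

    The static term is the power measured with the clock stopped (0 MHz);
    the dynamic coefficient is what remains, divided by the frequency.
    """
    if f_mhz <= 0:
        raise ValueError("f_mhz must be positive to isolate the dynamic term")
    static_mw = p_at_0mhz_mw
    dynamic_mw_per_mhz = (p_at_f_mw - p_at_0mhz_mw) / f_mhz
    return static_mw, dynamic_mw_per_mhz


def ip_model_power_mw(static_mw, dynamic_mw_per_mhz, f_mhz):
    """Equation (1): P_IPModel = P_Dynamic + P_Static."""
    return dynamic_mw_per_mhz * f_mhz + static_mw


# Hypothetical readings: 22 mW with the clock stopped, 62 mW at 40 MHz.
static_mw, d_coeff = split_static_dynamic(22.0, 62.0, 40.0)
print(ip_model_power_mw(static_mw, d_coeff, 25.0))
```

In practice the coefficients are fitted over many measurements rather than two points; the sketch only shows how the 0 MHz measurements isolate the static term.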

#### 2.3 Case Studies

To illustrate our approach, we model in this section the LAR space coder and the Fast Fourier Transform. In the first case, the power consumption of a fixed architecture is modeled on Virtex E and Virtex 2 pro FPGAs. In the second case, the FFT power consumption is modeled on a Virtex 2 pro for different kinds of butterfly architecture, with the data activity fixed to 50%.

#### 2.3.1 LAR space coder

The LAR (Locally Adaptive Resolution) method [1] is used to compress gray-level images coded on 8 bits. The interest of this method is that it adapts the local resolution to the image entropy. Here, the LAR space coder is considered as a black box. After a functional analysis, the most significant parameters are retained. The first parameter is the entropy, which reflects the image activity. The second is the clock frequency, which depends on the timing constraint. The last parameters are the image height and width, which have an impact on the FPGA slice utilization. Indeed, as shown in Table 1, the FPGA resources used by the LAR space coder depend on the image size.

| H   | W   | FPGA slice utilization (%)<br>Virtex II Pro XC2VP4 |
|-----|-----|----------------------------------------------------|
| 16  | 16  | 27.86                                              |
| 64  | 8   | 28.96                                              |
| 64  | 16  | 29.26                                              |
| 128 | 32  | 32.11                                              |
| 256 | 128 | 46.61                                              |
| 352 | 288 | 80.45                                              |

Table 1. LAR space coder FPGA slice utilization for different image sizes.

The LAR space coder was modeled on two FPGAs, the Virtex 300E and the Virtex 2 pro XC2VP4, and follows a similar law on both. The generic model is given by equation (4), for the parameter ranges given by equation (5).

$$P_{LARCoder}(mW) = (\alpha \times Ent + \beta) \times f_{MHz} + W \times \delta + H \times \epsilon + \lambda_{Plan} \tag{4}$$

$$Ent \in [0;8],\ f_{MHz} \in [0;50],\ W \in [16;352],\ H \in [16;288] \tag{5}$$

The specific model parameters for each component are given in Table 2. The number of slices used is nearly the same on both FPGAs. The $\lambda_{Plan}$, $\epsilon$ and $\delta$ parameters differ between the two devices, which is expected given the difference in transistor technology: the Virtex E uses a 0.18 $\mu m$ process and the Virtex 2 pro a 0.13 $\mu m$ process. As a result, the static power consumed is lower on the Virtex E than on the Virtex 2 pro. The dynamic power consumption depends on the $\alpha$ and $\beta$ parameters, which are higher for the Virtex E than for the Virtex 2 pro. Indeed, the core supply voltage is 1.8 V for the Virtex E against 1.5 V for the Virtex 2 pro.

The errors between model estimates and measurements are given in Table 3. The average error is 4.83% on the XC2VP4 and 7.81% on the V300E.

Table 2. Parameter values obtained for both FPGA families.

| FPGA   | $\alpha$ | $\beta$ | $\delta$ | $\epsilon$ | $\lambda_{Plan}$ |
|--------|----------|---------|----------|------------|------------------|
| XC2VP4 | 0.130    | 0.723   | 0.0066   | -0.0016    | 21.504           |
| V300E  | 0.196    | 1.048   | 0.0032   | -0.0008    | 6.1              |
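As an illustration, equation (4) can be evaluated with the coefficients of Table 2. The sketch below is not the authors' tool: the function name, the dictionary layout and the chosen operating point are assumptions made for the example, and the range check simply mirrors equation (5).

```python
# Coefficients copied from Table 2 (power in mW, frequency in MHz).
LAR_COEFFS = {
    "XC2VP4": {"alpha": 0.130, "beta": 0.723, "delta": 0.0066,
               "epsilon": -0.0016, "lambda_plan": 21.504},
    "V300E": {"alpha": 0.196, "beta": 1.048, "delta": 0.0032,
              "epsilon": -0.0008, "lambda_plan": 6.1},
}


def lar_coder_power_mw(device, entropy, f_mhz, width, height):
    """Evaluate equation (4) inside the validity ranges of equation (5)."""
    if not (0 <= entropy <= 8 and 0 <= f_mhz <= 50
            and 16 <= width <= 352 and 16 <= height <= 288):
        raise ValueError("parameters outside the ranges of equation (5)")
    c = LAR_COEFFS[device]
    dynamic = (c["alpha"] * entropy + c["beta"]) * f_mhz
    static = width * c["delta"] + height * c["epsilon"] + c["lambda_plan"]
    return dynamic + static


# Hypothetical operating point: entropy 4, 50 MHz, 352 x 288 image.
for device in LAR_COEFFS:
    estimate = lar_coder_power_mw(device, 4, 50, 352, 288)
    print(f"{device}: {estimate:.2f} mW")
```

Keeping the coefficients in a per-device table makes it easy to compare the same IP on both FPGA families for a given operating point.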

Table 3. Model accuracy against measurements.

| FPGA   | Max. error (%) | Min. error (%) | Av. error (%) |
|--------|----------------|----------------|---------------|
| XC2VP4 | 14.51          | 0.01           | 4.83          |
| V300E  | 18.43          | 0.24           | 7.81          |
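Error figures of this kind can be computed as sketched below. The paper does not state the exact error formula, so the sketch assumes the absolute relative error with respect to the measurement; the function name and the sample pairs are hypothetical and are not the paper's data.

```python
def error_stats(pairs):
    """Return (max, min, average) relative error in percent.

    `pairs` is a sequence of (estimated_mw, measured_mw) tuples. The error
    is assumed to be |estimate - measurement| / measurement.
    """
    errors = [abs(est - meas) / meas * 100.0 for est, meas in pairs]
    return max(errors), min(errors), sum(errors) / len(errors)


# Hypothetical (estimate, measurement) pairs in mW, for illustration only.
samples = [(90.0, 100.0), (51.0, 50.0), (57.0, 60.0)]
max_err, min_err, avg_err = error_stats(samples)
print(f"max {max_err:.2f}%  min {min_err:.2f}%  average {avg_err:.2f}%")
```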

#### 2.3.2 Fast Fourier Transform

Here, we model the power consumption of the Xilinx FFT IP [6] for three kinds of butterfly architecture: radix 2, radix 4 and pipelined. To simplify this work, the data activity is fixed to 50%. As for the LAR space coder, the high-level parameters were determined after a functional analysis. We retained two parameters:

- $f_{MHz}$ : the FFT clock frequency.
- N: Number of FFT points.

We decomposed the FFT power consumption into three terms (equation (6)).

$$P_{FFT} = D(N) \cdot f_{MHz} + S(N) + \lambda_{plan} \tag{6}$$

For the three architectures, we obtained three models, given by equations (7), (8) and (9) respectively, for the parameter ranges given by equation (10).

$$P_{FFT_{Radix2}} = (\alpha \times \ln(N) + \beta) \times f_{MHz} + \delta \times N + \lambda_{plan} \tag{7}$$

$$P_{FFT_{Radix4}} = (\alpha \times \ln(N) + \beta) \times f_{MHz} + \delta \times N + \lambda_{plan} \tag{8}$$

$$P_{FFT_{Pipelined}} = (\alpha \times N + \beta) \times f_{MHz} + \delta \times N + \lambda_{plan} \tag{9}$$

$$f_{MHz} \in [0; 50], N \in [8; 2048] \tag{10}$$

The model coefficients are given in Table 4 and the model accuracy in Table 5.

Table 4. Coefficient values obtained for each FFT architecture.

| Butterfly architecture | Radix 2 | Radix 4 | Pipelined |
|------------------------|---------|---------|-----------|
| $\alpha$               | 0.25    | 0.41    | 0.036     |
| $\beta$                | 1.0036  | 1.8685  | 3.161     |
| $\delta$               | 0.003   | 0.0057  | 0.024     |
| $\lambda_{plan}$       | 21.54   | 20.62   | 17.95     |

Table 5. Errors obtained for each FFT model.

| Butterfly architecture | Radix 2 | Radix 4 | Pipelined |
|------------------------|---------|---------|-----------|
| Max. error (%)         | 10.88   | 26.12   | 14.18     |
| Min. error (%)         | 0.06    | 0.02    | 0.11      |
| Average error (%)      | 3       | 5.26    | 7.39      |
| Number of measurements | 224     | 161     | 229       |
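The three FFT models of equations (7) to (9) can be compared with a small sketch using the coefficients of Table 4. It assumes that the logarithm in equations (7) and (8) is the natural logarithm; the function name, the dictionary layout and the example operating point are illustrative assumptions, not part of the original work.

```python
import math

# Coefficients copied from Table 4 (power in mW, frequency in MHz).
FFT_COEFFS = {
    "radix2": {"alpha": 0.25, "beta": 1.0036, "delta": 0.003,
               "lambda_plan": 21.54},
    "radix4": {"alpha": 0.41, "beta": 1.8685, "delta": 0.0057,
               "lambda_plan": 20.62},
    "pipelined": {"alpha": 0.036, "beta": 3.161, "delta": 0.024,
                  "lambda_plan": 17.95},
}


def fft_power_mw(arch, n_points, f_mhz):
    """Evaluate equation (7), (8) or (9) inside the ranges of equation (10)."""
    if not (0 <= f_mhz <= 50 and 8 <= n_points <= 2048):
        raise ValueError("parameters outside the ranges of equation (10)")
    c = FFT_COEFFS[arch]
    # The pipelined law grows linearly with N; the radix laws grow with the
    # logarithm of N (assumed here to be the natural logarithm).
    growth = n_points if arch == "pipelined" else math.log(n_points)
    dynamic = (c["alpha"] * growth + c["beta"]) * f_mhz
    static = c["delta"] * n_points + c["lambda_plan"]
    return dynamic + static


# Hypothetical operating point: 64-point FFT clocked at 50 MHz.
for arch in FFT_COEFFS:
    print(f"{arch}: {fft_power_mw(arch, 64, 50):.2f} mW")
```

Such a comparison lets the designer weigh the power cost of each butterfly architecture for a given number of points before any synthesis is run.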

## 3 SYSTEM ESTIMATION METHODOLOGY

In this section, we use these IP models to estimate the power consumption of a full system design. We consider a system of N IPs and assume that the power consumed by the system is the sum of the power consumed by the N IPs and the power consumed by the FPGA configuration plan, as given by equation (11).

$$P_{System} = P_{FPGAPlan} + \sum_{i=1}^{N} P_{IPi} \tag{11}$$

The power consumed by each IP is given by equation (12). For example, for the FFT or the LAR space coder, it is the previous model without the $P_{FPGAPlan}$ term.

$$P_{IPi} = S(HighLevelParameters)_i + D(HighLevelParameters)_i \times Frequency_i \tag{12}$$

To validate this approach, combinations of up to five IPs were implemented. Table 6 shows the average error between estimates and measurements as the number of IPs increases. The average error for one IP is about 5%. As the number of IPs increases, the average error rises to around 10%. When an IP is modeled alone on the FPGA, a power consumption overhead is caused by the length of the interconnections between the IP core and the FPGA pads. When the IPs are combined, the interconnection length between two IPs is reduced. Thus, in all cases, the power is overestimated.

Table 6. Average error obtained for various numbers of IPs implemented on the FPGA.

| Number of IPs     | 1    | 2    | 3     | 4     | 5     |
|-------------------|------|------|-------|-------|-------|
| Average error (%) | 5.07 | 8.94 | 10.66 | 10.58 | 10.22 |
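The system-level sum of equations (11) and (12) can be sketched as follows. The example combines the LAR space coder (Table 2, XC2VP4) and the radix-2 FFT (Table 4) at a hypothetical operating point. Two assumptions are made: the logarithm in the FFT model is read as the natural logarithm, and the LAR $\lambda_{Plan}$ value of 21.504 mW stands in for $P_{FPGAPlan}$, since the text does not state which value applies to a complete system. The function names are illustrative.

```python
import math


def ip_power_mw(static_mw, dynamic_mw_per_mhz, f_mhz):
    """Equation (12): power of one IP, without the configuration plan term."""
    return static_mw + dynamic_mw_per_mhz * f_mhz


def system_power_mw(p_fpga_plan_mw, ips):
    """Equation (11): configuration plan power plus the sum over all IPs.

    `ips` is a list of (static_mw, dynamic_mw_per_mhz, f_mhz) tuples.
    """
    return p_fpga_plan_mw + sum(ip_power_mw(s, d, f) for s, d, f in ips)


# LAR space coder on XC2VP4 (Table 2): Ent = 4, W = 352, H = 288, 50 MHz.
lar = (352 * 0.0066 + 288 * -0.0016, 0.130 * 4 + 0.723, 50.0)
# Radix-2 FFT (Table 4): N = 64, 50 MHz, natural logarithm assumed.
fft = (0.003 * 64, 0.25 * math.log(64) + 1.0036, 50.0)

# Assumption: 21.504 mW (LAR lambda_Plan on XC2VP4) is used as P_FPGAPlan.
total_mw = system_power_mw(21.504, [lar, fft])
print(f"Estimated system power: {total_mw:.2f} mW")
```

Each IP keeps its own clock frequency in the tuple, which matches the per-IP $Frequency_i$ term of equation (12).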

## 4 CONCLUSION AND FUTURE WORK

In this paper, we have presented an efficient high-level power estimation method for SoCs, based on an IP power modeling methodology validated on FPGA. During the first steps of a system design flow, these models help the designer develop an application under a power constraint. The time spent backtracking in the design flow is strongly reduced, and the models can be reused to design a new system. Future work will consist of defining a formalism for IP models and extending the IP library, in particular with IPs dedicated to communication such as NoCs.

## References

- [1] M. Babel, O. Deforges, and J. Ronsin. Lossless and lossy minimal redundancy pyramidal decomposition for scalable image compression technique. *4th IEEE International Conference on Multimedia and Expo (ICME)*, 1999.
- [2] N. Julien, J. Laurent, E. Senn, and E. Martin. Power consumption modeling and characterization of the TI C6201. *IEEE Micro*, Volume 23, Issue 5, pages 40-49, Sept-Oct 2003.
- [3] J. Laurent, N. Julien, E. Senn, and E. Martin. Functional level power analysis: An efficient approach for modeling the power consumption of complex processors. *IEEE DATE*, 2004.
- [4] J. M. Rabaey and M. Pedram. *Low power design methodologies*. Kluwer Academic Publishers, ISBN 0-7923-9630-8, 1996.
- [5] R. Saleh, G. Lim, T. Kadowaki, and K. Uchiyama. Trends in low power digital system-on-chip designs. *International Symposium on Quality Electronic Design (ISQED)*, 2002.
- [6] Xilinx. Fast Fourier Transform v3.0. DS260, 21 May 2004.