### Efficient Smart Sampling based Full-Chip Leakage Analysis for Intra-Die Variation Considering State Dependence

Vineeth Veetil, Dennis Sylvester, David Blaauw, Saumil Shah\*, Steffen Rochel\*

EECS Department, University of Michigan, Ann Arbor, MI; \*Blaze DFM, Sunnyvale, CA

{tvvin,dennis,blaauw}@eecs.umich.edu, saumil@umich.edu, s.rochel@ieee.org

### Abstract

Leakage power minimization is critical to semiconductor design in nanoscale CMOS. On the other hand increasing variability with scaling adds complexity to the leakage analysis problem. In this work we seek to achieve tractability in Monte Carlo-based statistical leakage analysis. A novel approach for fast and accurate statistical leakage analysis considering inter-die and intra-die components is proposed. We show that the optimal way to select samples, to capture intra-die variation accurately, is according to the probability distribution function of total process variation. Intelligent selection of samples is performed using a Quasi Monte Carlo technique. Results are presented for benchmarks with sizes varying from approximately 5,000 to 200,000 gates. The largest benchmark with 198461 gates is evaluated in 3 minutes with the proposed approach compared to 23 hours for random sampling with comparable accuracy. Compared to a conventional analytical approach using Wilkinson's approximation, the proposed technique offers superior accuracy while maintaining efficiency. State dependence and multiple sources of variation are considered and the approach is scalable with number of process parameter variables for standard cell characterization cost. We also show reduction in sample size to meet target accuracy for computing leakage distribution due to the inter-die component only when compared to random selection of samples.

### Categories and Subject Descriptors

J.6 [Computer Applications] Computer-Aided Design - *computer-aided design (CAD).* 

### General Terms

Algorithms, Verification

### Keywords

Monte Carlo, Variance reduction, Statistical leakage

### 1. Introduction

As circuit design moves to smaller technology nodes the standby power dissipation of devices is of critical concern. According to the 2007 ITRS roadmap circuit leakage control is a challenge for both high performance and low power design. For high performance design increasing design complexity and leakage scaling makes it difficult to control static power while meeting performance requirements. For low power design increase in leakage power is most challenging. Thus accurate and efficient leakage analysis is crucial for the designer. On the other hand increasing process variation with scaling adds complexity to leakage analysis. A promising solution is to perform statistical analysis of leakage and use this to guide leakage optimization and design changes.

Current approaches to calculate full-chip leakage power can be classified into two main categories. The first category of methods is analytical in nature. These attempt to model full-chip leakage using a standard distribution, most commonly a lognormal distribution. The moments of this distribution are computed by matching moments with an expression involving summation of leakage distributions at the gate level [1-4]. In [1] a lognormal distribution is used to approximate the leakage current of each gate and the total leakage is obtained by summing the lognormals. A low rank quadratic approximation to capture non-lognormal leakage distributions is proposed in [2]. It is noted that a 20% error is observed when modeling leakage distributions as purely lognormal using a linear approximation. The authors in [3] attempt to capture high level characteristics of a candidate chip design for early mode leakage estimation. In [4] the authors propose a systematic characterization of leakage related parameter variations. A quadratic model of the logarithm of leakage current is also proposed. Traditionally these approaches have provided the desired accuracy. However they make assumptions about either the nature of the statistical distribution of process variation parameters or the nature of the dependence of standard cell leakage on the underlying variables for handling process variation. The process variation parameters are assumed to have a standard distribution, most commonly Gaussian, or the logarithm of standard cell leakage is assumed to be a linear or quadratic sum of the variables modeling process variation. It is not clear that these assumptions will still hold true considering secondary effects in process variation and a growing number of variation sources at technology nodes below 45nm.

DAC'09, July 26-31, 2009, San Francisco, California, USA. Copyright 2009 ACM 978-1-60558-497-3/09/07....10.00

The second category of methods consists of Monte Carlo based techniques, which select samples in the process variation space and use these samples to compute the leakage distribution. Monte Carlo techniques can handle non-standard distributions of process parameters and lookup tables for the dependence of standard cell leakage on process variables. Therefore they do not require simplifying assumptions about the dependence of leakage on process parameters or the nature of the process parameter distribution, making them highly scalable. Also the inherent parallelism in evaluating Monte Carlo samples makes these techniques amenable to multi-core and Graphics Processing Unit (GPU) computing. However Monte Carlo techniques typically require a large sample size, rendering them expensive. There is a need for smart selection of samples to reduce the number of samples that require evaluation without compromising accuracy. In [5] the author describes such techniques, known as variance reduction techniques. These techniques need to be tailored to the system under consideration for efficient reduction in sample size. In the context of integrated circuits it has been shown that a suitable choice of these techniques can lead to significant sample size reduction for statistical timing analysis [6]. We study the applicability of such techniques for leakage analysis in this work.

There are two main contributions in this paper. First, to the best of our knowledge, this work is the first to study sample size reduction for statistical leakage analysis using a Monte Carlo based approach. We consider intra-die variation, state dependence and multiple sources of process variation. Second, we address the issue of standard cell characterization, which is largely ignored in the literature. Statistical circuit leakage analysis involves characterization of standard cells at grid points in the process variation space. This is illustrated in the schematic for a traditional flow in Figure 1. Although characterization is performed only once in the design flow for a library, the number of grid points grows exponentially with the number of process variation parameters. There is a need to select samples to reduce characterization cost while meeting target accuracy in leakage analysis.

We first consider the problem of leakage analysis for the case of interdie variation involving multiple process variation parameters. For this we




Figure 1. Traditional and proposed leakage analysis flow for global variation with multiple sources.

propose to use a Quasi Monte Carlo technique [7] for selecting samples in the process variation space. We show that for a large benchmark circuit there is a significant reduction in the sample size needed to meet target accuracy when compared to a random selection of samples for computing the leakage distribution. Standard cell characterization needs to be performed only at these samples, which reduces the cost of standard cell leakage characterization.

Next we propose a solution for the case of the total leakage distribution considering inter-die and intra-die components, which is the major contribution of this paper. We recognize that this problem can be formulated as selecting samples for inter-die variation and computing the local distributions at each of these samples due to intra-die variation. Computation of the moments of the local distribution requires additional samples in the neighborhood of each inter-die sample. The number of these additional samples can be prohibitively high. We propose techniques for efficient selection of the samples. The key ideas are as follows. First, we show that the optimal way to select samples to compute local distributions accurately is to select samples according to the probability distribution function of total process variation. Second, the selection of samples is performed intelligently by using the Quasi Monte Carlo technique.

Experiments are performed on benchmark circuits synthesized in a 45nm commercial technology. State dependence information is also considered. We compare our technique with three approaches: 1) random sampling, 2) a technique referred to as Method1, and 3) a traditional analytical approach based on [1]. Method1 involves smart selection of inter-die samples but no intelligence or reuse of samples for intra-die variation. For the largest benchmark considered, with 198461 gates, the proposed approach requires 3 minutes whereas random sampling and Method1 complete the task in 23 hours and 18.4 hours, respectively. We also achieve accurate results for estimation of $\mu$, $\sigma$, and the 95<sup>th</sup> percentile of the chip leakage distribution for all benchmarks considered, with low runtime.

The paper is organized as follows. Section 2 describes the Quasi Monte Carlo approach, which is a standard technique to reduce sample size for Monte Carlo analysis. Section 3 proposes a leakage analysis technique for the case of inter-die variation using a Quasi Monte Carlo technique. Section 4 addresses total leakage analysis involving inter-die and intra-die variation using smart samples. Results and conclusions are presented in Sections 5 and 6 respectively.

### 2. Smart Sampling for Leakage Analysis

Monte Carlo-based leakage analysis involves selecting samples in the process variation space to obtain a statistical distribution of circuit leakage. This is mapped to the standard mathematical problem of Monte Carlo (MC), which is to estimate the integral of a function using samples in its domain. There are standard techniques for variance reduction of MC, including Quasi Monte Carlo techniques. These techniques are detailed in [5].

### 2.1 Quasi Monte Carlo

The standard Monte Carlo (MC) method addresses the problem of approximating the integral of a function f(x) over the s-dimensional hypercube  $C^{s} = [0, 1)^{s}$  where x represents a point in an s-dimensional space. The MC estimate of the integral is given by the arithmetic mean of  $f_i$  which are values of the function f(x) evaluated at *n* samples distributed throughout the hypercube. The error bound of a method to numerically estimate an integral using a sequence of samples is mathematically related to a measure of uniformity for the distribution of the points called "discrepancy". A sequence with the smallest possible discrepancy has the property that when used to evaluate the mean it achieves the smallest possible error bound. Sequences constructed to reduce discrepancy are called Low Discrepancy Sequences (LDSs). Quasi Monte Carlo techniques are characterized by their use of LDSs to generate samples. LDSs are deterministic sequences, i.e., there is no randomness in their generation. Intuitively these sequences are well dispersed through the domain of the function, minimizing any gaps or clustering of points. Figure 2 illustrates that quasi random sequences generate samples with lower discrepancy compared to pseudo random sequences (sequences with properties similar to "truly" random sequences). Sobol, Faure, and Niederreiter are LDSs that have been studied extensively. In this work we consider Sobol sequences, which are known to be simple to construct. Interested readers can refer to [7] for a construction of the Sobol sequence. In the context of circuits Quasi Monte Carlo techniques have been studied for statistical timing analysis [6] where results indicate that the techniques are a good fit and are amenable to multi-core and GPU computing. This work is the first to study the application of Quasi Monte Carlo (QMC) techniques for statistical leakage analysis.
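The contrast between quasi-random and pseudo-random samples can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a 2-D Halton sequence (another low discrepancy sequence, built from van der Corput radical inverses) instead of the Sobol sequence used in this work, purely to stay dependency-free, and the test integrand, sample count and seed are arbitrary choices.

```python
import math
import random


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of `index` in `base`, a value in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_2d(n: int) -> list:
    """First n points of the 2-D Halton sequence (bases 2 and 3), skipping index 0."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(1, n + 1)]


def mean_estimate(points, f) -> float:
    """MC-style estimate of the integral: arithmetic mean of f over the points."""
    return sum(f(x, y) for x, y in points) / len(points)


def integrand(x: float, y: float) -> float:
    # Hypothetical smooth test function; its exact integral over [0,1)^2 is (e - 1)^2.
    return math.exp(x + y)


EXACT = (math.e - 1.0) ** 2
N = 1024

rng = random.Random(0)  # fixed seed so the pseudo-random run is repeatable
pseudo_points = [(rng.random(), rng.random()) for _ in range(N)]

qmc_error = abs(mean_estimate(halton_2d(N), integrand) - EXACT)
mc_error = abs(mean_estimate(pseudo_points, integrand) - EXACT)
print(f"quasi-random error: {qmc_error:.5f}, pseudo-random error: {mc_error:.5f}")
```

For smooth integrands like this one the low discrepancy points typically give the smaller error at equal sample count, although the pseudo-random result depends on the seed.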

### 3. Leakage Analysis for Inter-Die Variation with Smart Sampling

In this section we first describe the steps in an industrial leakage analysis flow. In a typical industrial flow, circuit leakage analysis involves characterization of a standard cell library and computation of circuit leakage using the characterized data, as explained in Section 3.2. We then introduce our approach to estimating statistical leakage due to inter-die parameter variation, which achieves tractability for multiple sources of process variation.

### 3.1 Process Variation Model

Process variation parameters such as critical dimension (CD) and oxide thickness exhibit correlations. To account for correlations between parameters, principal component analysis (PCA) is performed. Critical dimension, threshold voltage and oxide thickness are thus expressed as linear combinations of principal components. For process technology nodes at 45nm and below, some foundries provide such statistical information with principal component analysis. Process variation models with inter-die and intra-die components are widely used in the literature [1]. Each process variation parameter has a global or inter-die component, which is modeled by a single random variable per parameter for a die. Intra-die components account for spatial correlation within the die and uncorrelated random variation per device. In this model the die is partitioned into an $n \times n$ grid and identical parameter variations are assumed within a grid panel. Therefore, each source of variation is represented by a set of random variables, one for each panel in the grid. For example, transistor gate length variation is represented by a set of random variables, one per panel, that follows a multivariate normal distribution with covariance matrix $R_{Lg}$. As mentioned above, the process variation parameters have been resolved into principal components. It follows that each component is represented by a set of random variables, one per panel. Principal component analysis (PCA) is again performed on these spatially correlated variables. In addition an independent random variable





### 3.2 Traditional Leakage Analysis Flow for Inter-Die Variation

The standard cell library is characterized for leakage information at grid points in the process variation space. To include state dependence information, standard cells are characterized at the grid points for each input state. If state dependence is not considered then an average of the leakages for all input states is computed.

In a traditional Monte Carlo-based leakage analysis flow (to account for inter-die parameter variation) process parameter variables or their principal components are sampled. As only global variation is considered the same sample set is assigned to every element type in the standard cell library. The leakage value per element type in the library is obtained by interpolation in the leakage lookup table for the element type. The circuit leakage is obtained by adding up the leakage value obtained for each element type after weighting by the number of occurrences of the element type in the circuit.

The above approach does not consider state dependence of standard cell leakage. To enable leakage calculation to account for state dependence, the standard cell characterization data must have leakage information for every cell state as mentioned above. In addition, at the circuit level state probability information is required for every instance of each element type in the circuit. Various approaches exist in the literature to arrive at an estimate of state probability for each instance. For a detailed discussion on this topic refer to [8].
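The traditional flow described in this section can be sketched as follows. This is a minimal illustration, not the industrial implementation: it assumes a single process variable, a piecewise-linear interpolation scheme, and state probabilities shared by all instances of a cell type (the text calls for per-instance probabilities). All table values, cell names and counts are hypothetical.

```python
from bisect import bisect_right

# Hypothetical characterization data: leakage (nW) of each cell type and input
# state, tabulated at grid points of ONE process variable (in sigma units).
GRID = [-3.0, -1.5, 0.0, 1.5, 3.0]
LEAKAGE_TABLE = {
    "INV":   {"0": [9.0, 4.0, 2.0, 1.0, 0.5], "1": [6.0, 3.0, 1.5, 0.8, 0.4]},
    "NAND2": {"00": [4.0, 2.0, 1.0, 0.5, 0.3], "11": [12.0, 6.0, 3.0, 1.6, 0.8]},
}
CELL_COUNTS = {"INV": 1000, "NAND2": 500}
# Hypothetical state probabilities, shared here by all instances of a type.
STATE_PROB = {"INV": {"0": 0.5, "1": 0.5}, "NAND2": {"00": 0.25, "11": 0.75}}


def interpolate(grid, values, x):
    """Piecewise-linear interpolation in the lookup table, clamped to the grid."""
    if x <= grid[0]:
        return values[0]
    if x >= grid[-1]:
        return values[-1]
    hi = bisect_right(grid, x)
    lo = hi - 1
    t = (x - grid[lo]) / (grid[hi] - grid[lo])
    return values[lo] + t * (values[hi] - values[lo])


def cell_leakage(cell, x, use_state_prob):
    """Leakage of one cell type at the inter-die sample x."""
    per_state = {s: interpolate(GRID, v, x) for s, v in LEAKAGE_TABLE[cell].items()}
    if use_state_prob:
        return sum(STATE_PROB[cell][s] * leak for s, leak in per_state.items())
    return sum(per_state.values()) / len(per_state)  # plain average over states


def circuit_leakage(x, use_state_prob=False):
    """Sum over element types, weighted by occurrence count in the circuit."""
    return sum(n * cell_leakage(c, x, use_state_prob) for c, n in CELL_COUNTS.items())


nominal = circuit_leakage(0.0)
nominal_with_states = circuit_leakage(0.0, use_state_prob=True)
print(nominal, nominal_with_states)
```

Because only global variation is modeled, the same sample `x` is passed to every cell type, which is why the circuit total reduces to a count-weighted sum.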

### 3.3 Proposed Leakage Analysis Flow with Smart Sampling

We propose to use Quasi Monte Carlo based sampling for standard cell library characterization and runtime leakage analysis. In a traditional flow standard cells are characterized at discrete grid points in the space of random variables to model process variation as explained in Section 3.2. In the proposed approach the characterization is performed at samples generated using a Quasi Monte Carlo (QMC) based approach. In





Figure 4. Reusing samples for local distribution computation. Inter-die samples weigh the samples in total variation space according to local probability distribution.

particular we use Sobol sequences in the QMC approach in this work. QMC samples refer to Sobol samples in the rest of the paper. The same process variation samples are used for characterization of all element types in the standard cell library and their states.

The proposed approach differs from a traditional flow during runtime in that samples are not generated at this stage. The inter-die samples are precomputed during cell library characterization. A given inter-die sample is assigned to every element type in the library as before and the circuit leakage is obtained by adding up the leakage values from element types as in the traditional flow. It follows that there is now no need for interpolation in the look-up table from cell characterization. The leakage values are readily available in the tables without need for interpolation. The traditional and proposed flows are illustrated in Figure 1.
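A minimal sketch of the runtime step in the proposed flow follows, assuming the library has already been characterized at a shared set of precomputed samples. The leakage values, cell names and the nearest-rank percentile convention are hypothetical choices for illustration; a real library would hold one entry per QMC sample, per cell type and state.

```python
import math
import statistics

# Hypothetical precomputed characterization: entry k of each list is the cell's
# leakage (nW) at the k-th precomputed process sample (a Sobol point in the paper).
LEAKAGE_AT_SAMPLE = {
    "INV":   [1.2, 2.9, 0.7, 1.9, 4.1, 1.0, 2.3, 1.5],
    "NAND2": [2.1, 5.2, 1.3, 3.4, 7.6, 1.8, 4.0, 2.7],
}
CELL_COUNTS = {"INV": 1000, "NAND2": 500}


def circuit_leakage_samples(table, counts):
    """Circuit leakage at every precomputed sample: direct lookup, no interpolation."""
    num_samples = len(next(iter(table.values())))
    return [
        sum(counts[cell] * table[cell][k] for cell in counts)
        for k in range(num_samples)
    ]


def percentile(values, fraction):
    """Nearest-rank percentile (one of several common conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


samples = circuit_leakage_samples(LEAKAGE_AT_SAMPLE, CELL_COUNTS)
mu = statistics.fmean(samples)
sigma = statistics.pstdev(samples)
p95 = percentile(samples, 0.95)
print(f"mu={mu:.2f} nW, sigma={sigma:.2f} nW, 95th percentile={p95:.2f} nW")
```

The samples are weighted equally here on the assumption that they were generated according to the joint distribution of the process variables, so no per-sample probability weight is needed.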

### 4. Leakage Analysis for Total Variation with Smart Sampling

This section proposes an algorithm for estimating full-chip leakage considering inter-die and intra-die components of variation. In sub-45 nm technologies secondary effects in process variation are important and the number of significant sources of process variation is increasing. Existing approaches to calculate full-chip leakage power make simplifying assumptions about either the nature of statistical distribution of process variation parameters or the nature of dependence of the standard cell leakage on these parameters. The parameters are assumed to have a standard distribution or the logarithm of standard cell leakage is assumed to be a linear or quadratic sum of the parameters. Combined with a growing number of process variation sources this is a limitation on the accuracy. Monte Carlo based methods on the other hand are expensive when handling intra-die variation. The proposed approach can efficiently handle any non-standard distribution of variables or dependence of full-chip leakage on these variables.

A schematic of the proposed approach for total variation is illustrated in Figure 3. Process variation consists of inter-die and intra-die components. In Section 3 we discussed generation of samples in the space of inter-die variation distributed according to the joint probability distribution of the variables involved. We apply intra-die variation to such a sample around the nominal and obtain a local leakage distribution for the circuit. The sum of these distributions from all samples should give the total leakage distribution. From a sampling perspective this translates to generating more samples distributed according to the intra-die distribution around each inter-die sample. In this way the problem of total leakage variation can be formulated as a two-level sampling problem, the first level corresponds to inter-die variation and the second level corresponds to intra-die variation at each of the samples in the first level. Using Quasi Monte Carlo sampling, accurate results for inter-die variation can be achieved with few samples as explained in Section 3. However even for a low number of first level, or inter-die, samples the total number of samples in the second level can be prohibitively high. The



Figure 5. (a) Total (inter-die + intra-die) distribution and local pdf at an inter-die sample. (b) QMC based samples are generated according to total variation. (c) For computing the mean of the local pdf, the samples generated in (b) are weighted according to the ratio of the probabilities in the two distribution functions.

idea here is that if the second level samples are chosen optimally such that either the entire set or a subset can be used for computation at every inter-die sample the number of samples can be minimized. A uniform sampling approach in a bounded space enveloping the inter-die samples may be tried. However while this considers outliers in the inter-die distribution this does not weigh samples close to the nominal adequately. The problem is to arrive at a pdf which is optimal for all samples.

Consider the first level or inter-die samples in the process variation space. The problem is to find a pdf for optimality in computation at every inter-die sample. Such a pdf is obtained by summation of the pdfs of local distributions at the inter-die samples. Now we have the surprisingly simple result that if the number of inter-die samples is large enough the summation of the pdfs converges to the pdf for the distribution obtained for total variation with inter-die and intra-die components. The proof has been omitted for brevity. Our experiments indicate that if the inter-die samples are chosen according to a Sobol sequence and the sample size is large enough (typically more than 100) this is indeed true. Therefore we select the second level samples according to the pdf for total variation. To minimize the number of second level samples we use Sobol sequences to sample in this space.
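The omitted proof is not reproduced here. The following is only a sketch of why the claim is plausible, under two assumptions that are ours rather than the text's: the inter-die and intra-die components are independent and additive, and the summed local pdfs are normalized by the number of inter-die samples $M$. The symbols $p_{\mathrm{inter}}$, $p_{\mathrm{intra}}$, $p_{\mathrm{total}}$ and $S_j$ are introduced for illustration.

$$\frac{1}{M}\sum_{j=1}^{M} p_{\mathrm{intra}}(x - S_j) \;\xrightarrow[M \to \infty]{}\; \int p_{\mathrm{intra}}(x - s)\, p_{\mathrm{inter}}(s)\, ds = p_{\mathrm{total}}(x)$$

The left-hand side is a sample-mean estimate of the integral when the $S_j$ are distributed according to $p_{\mathrm{inter}}$, and the integral is the convolution that gives the density of the sum of two independent components.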

For the case of no spatial correlation the idea is illustrated in Figure 4 where two samples are shown on the inter-die distribution. The second level samples are chosen to be Quasi Monte Carlo based samples in the total process variation space. One such sample in Figure 4 lies in different regions of the pdf for the two inter-die samples. Therefore the first level samples assign different weights to the leakage values obtained at a particular second level sample. The characterization step needs to compute leakages for standard cells at the second level samples only. The procedure to reuse samples is illustrated in Figure 5 for the case of a 2D process variation space. Figure 5a shows the total process variation distribution along with the local distribution at an inter-die sample *S*. Figure 5b shows the second level samples are reused for computation of moments of local distribution at S as in Figure 5c. In particular the mean of local distribution at S for the circuit,  $\overline{L}(S)$  is given by

$$\overline{L}(S) = \sum_{i=1}^{N} \frac{L(x_i) \times \mathit{JpdfIntra}(x_i - S)}{\mathit{JpdfTotal}(x_i)} \tag{1}$$

where $x_i$, $i = 1, \ldots, N$, are the second level samples, *JpdfIntra* is the probability distribution for intra-die variation and *JpdfTotal* is the probability distribution for total variation. Higher moments of the local distribution can be computed similarly. The total leakage distribution is a sum of local leakage distributions and is computed using $\overline{L}(S)$ and the higher moments obtained for all samples. In the case of spatially correlated intra-die variation, the sample for one variable is not a single value but a set of values corresponding to grids in the spatial correlation model. This means that each element of vector $x_i$ in (1) is not a scalar but a vector with correlated elements. The number of elements in this vector is equal to the number of grids. The functions $\mathit{JpdfIntra}(x_i - S)$ and $\mathit{JpdfTotal}(x_i)$ are modified to include the spatial correlation. We explore spatial correlation later and show that this level of modeling process variation is not needed for large circuit and full-chip leakage analysis.
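The reuse of second level samples in (1) can be illustrated with a one-dimensional sketch. Everything specific here is an assumption for illustration: Gaussian inter-die and intra-die components, an exponential leakage model, a van der Corput sequence standing in for Sobol, and an explicit $1/N$ normalization of the weighted sum, which (1) as printed does not show.

```python
import math
from statistics import NormalDist

SIGMA_INTER = 1.0   # hypothetical inter-die sigma of one process variable
SIGMA_INTRA = 0.5   # hypothetical intra-die sigma of the same variable
# Independent components add in variance, so the total sigma is the hypotenuse.
SIGMA_TOTAL = math.hypot(SIGMA_INTER, SIGMA_INTRA)
total_dist = NormalDist(0.0, SIGMA_TOTAL)


def radical_inverse_base2(index: int) -> float:
    """Van der Corput point in (0, 1) for index >= 1 (stand-in for Sobol)."""
    result, fraction = 0.0, 0.5
    while index > 0:
        index, digit = divmod(index, 2)
        result += digit * fraction
        fraction /= 2.0
    return result


def leakage(x: float) -> float:
    """Hypothetical leakage model: exponential in the process variable."""
    return math.exp(0.5 * x)


def second_level_samples(n: int) -> list:
    """n low-discrepancy samples distributed according to the TOTAL pdf."""
    return [total_dist.inv_cdf(radical_inverse_base2(i)) for i in range(1, n + 1)]


def local_mean(inter_die_sample: float, samples: list) -> float:
    """Mean of the local distribution at one inter-die sample, reusing `samples`."""
    intra_dist = NormalDist(inter_die_sample, SIGMA_INTRA)
    weighted = [leakage(x) * intra_dist.pdf(x) / total_dist.pdf(x) for x in samples]
    return sum(weighted) / len(samples)  # 1/N normalization is our assumption


def exact_local_mean(inter_die_sample: float) -> float:
    """Closed form for this toy model: E[exp(0.5 * (S + e))], e ~ N(0, SIGMA_INTRA^2)."""
    return math.exp(0.5 * inter_die_sample + 0.5 * (0.5 * SIGMA_INTRA) ** 2)


samples = second_level_samples(4096)      # generated once
for S in (0.8, -0.5):                     # reused for every inter-die sample
    print(S, local_mean(S, samples), exact_local_mean(S))
```

The same 4096 samples serve both inter-die points; only the weights change, which is the source of the savings over running a separate Monte Carlo analysis at each inter-die sample.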

The local distribution corresponding to one sample can be approximated using the Central Limit Theorem (CLT) [9]. If spatial correlation is not considered, this local distribution is the sum of independent, identically distributed random variables contributed by the instances of each element type in the cell library. If there are enough instances, the local distribution approaches a normal distribution. Also, for a large number of instances the spread of this distribution relative to its mean approaches zero, so the local distribution effectively collapses to a single number, its mean. In the presence of spatial correlation, as long as there are sufficient independent regions in a die, i.e., the circuit is large enough, the Central Limit Theorem can be applied [10] as if all intra-die variation were uncorrelated. A reduction in variance of the local distribution translates to a reduction in the number of second level, or additional, samples for a target accuracy. For large circuit blocks and chips the problem is essentially to compute only the mean of the local distribution at each inter-die sample. For circuits where spatial correlation has a significant effect on the leakage distribution, the technique can still be applied. The local distribution within a grid panel is the sum of independent, identically distributed random variables contributed by the instances of a given element type in the panel. Therefore the local distribution within each grid panel approaches a normal distribution as the number of instances in the panel grows, which reduces the number of additional samples needed to capture the local distribution when spatial correlation is considered.
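The variance argument can be made concrete with a short worked relation. Assume, for illustration only, that a circuit has $n$ instances of one element type whose intra-die leakage contributions $\ell_k$ are independent and identically distributed with mean $m$ and variance $v$ (these symbols are ours). Then

$$\mathrm{E}\Big[\sum_{k=1}^{n} \ell_k\Big] = n\,m, \qquad \mathrm{Var}\Big[\sum_{k=1}^{n} \ell_k\Big] = n\,v, \qquad \frac{\sqrt{n\,v}}{n\,m} = \frac{1}{\sqrt{n}} \cdot \frac{\sqrt{v}}{m}$$

so the standard deviation of the local distribution relative to its mean falls as $1/\sqrt{n}$. With $n = 10{,}000$ instances, for example, the relative spread is 100 times smaller than that of a single instance.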

### 5. Results

Our simulation results are based on a 45nm commercial technology. Principal component analysis is used to obtain principal components for the correlated process variation parameters including CD, oxide thickness and threshold voltage. Simulations are performed on industrial circuits with sizes ranging from approximately 5,000 to 200,000 gates. In our implementation, we only consider inter-die variation and uncorrelated intra-die variation. Spatially correlated intra-die variation is not implemented. In the presence of spatial correlation, as long as there are sufficient independent regions in a die, i.e., the circuit is large enough, the Central Limit Theorem can be applied as explained in [10] and therefore the results are accurate for large circuit blocks and chips. This is illustrated in Figure 6 for a benchmark circuit with approximately 43,000 gates. The standard deviation of the leakage distribution without considering spatial correlation is compared to the case where a grid-based spatial correlation model is considered. The total standard deviation for

| Circuit | Gate count | Golden μ (mW) | Golden σ (mW) | Golden 95<sup>th</sup> percentile (mW) | Golden runtime | Proposed μ (mW) | Proposed σ (mW) | Proposed 95<sup>th</sup> percentile (mW) | Proposed runtime (s) | Error μ (%) | Error σ (%) | Error 95<sup>th</sup> percentile (%) | Speedup |
|---------|------------|---------------|---------------|------------------------------|----------------|-----------------|-----------------|--------------------------------|----------------------|-------------|-------------|----------------------------|---------|
| VD1     | 5536       | 0.51          | 0.18          | 0.85                         | 1.7 hours      | 0.51            | 0.17            | 0.87                           | 1.77                 | 0.03        | 3.55        | 2.21                       | 3405    |
| VD2     | 13258      | 1.21          | 0.42          | 1.99                         | 4.8 hours      | 1.20            | 0.40            | 2.04                           | 1.83                 | 0.30        | 2.61        | 2.51                       | 9495    |
| USB     | 15946      | 1.11          | 0.36          | 1.79                         | 7.4 hours      | 1.11            | 0.36            | 1.85                           | 1.95                 | 0.01        | 1.97        | 3.35                       | 13738   |
| ETHER   | 23939      | 1.40          | 0.46          | 2.26                         | 10.2 hours     | 1.40            | 0.45            | 2.33                           | 2.09                 | 0.06        | 1.99        | 3.10                       | 17633   |
| VGA     | 43214      | 2.85          | 0.98          | 4.71                         | 15.6 hours     | 2.84            | 0.96            | 4.85                           | 2.02                 | 0.49        | 2.31        | 2.97                       | 27778   |
| Chip1\* | 198461     | 10.63         | 2.67          | 15.59                        | 19.2 days      | 10.64           | 2.63            | 15.96                          | 278                  | 0.10        | 1.71        | 2.37                       | 5969    |

Table 1. Comparison of proposed approach with Golden (Monte Carlo, 20,000 samples) for benchmarks. \* indicates that state probability is considered for instances in the circuit.

intra-die variation is the same in both cases. The assumption of no spatial correlation accurately estimates standard deviation for number of grid panels above 256, supporting the argument in [10]. Therefore, spatial correlation is not a limitation for circuit blocks and chips with practical sizes for the current implementation, which is our focus in this work. The modification in the algorithm for the case of smaller circuits is discussed in Section 4.

Figure 7 shows the result for our proposed approach for inter-die parameter variation using smart samples. The smart samples are obtained from a Sobol sequence. The error in estimating  $\sigma$  of leakage distribution



Figure 6. Comparison of sigma of leakage distribution without considering spatial correlation with that of a grid-based spatial correlation model for VGA circuit (43214 gates).



Figure 7. Comparison of error in estimating σ of leakage distribution for inter-die variation using QMC vs random sampling for VGA circuit (43214 gates).

for inter-die variation using smart samples is compared with a random sampling based approach for a VGA circuit with approximately 43,000 gates. The golden value is obtained from a Monte Carlo simulation with 20,000 samples. We compare the minimum sample size required to achieve target accuracy of 3% error in estimating  $\sigma$  for both methods. The proposed approach requires 9.3X fewer samples compared to random sampling. In a typical industrial flow the standard cells are characterized at grid points in the process variation space. With 7 grid points chosen for each of the three principal components in our implementation the number of points to be characterized is 343 in a traditional flow whereas the proposed approach requires only 150 from Figure 7, a 56% reduction in standard cell characterization overhead.
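As a check on the arithmetic behind the characterization saving, with 7 grid points for each of three principal components and the 150 samples read from Figure 7:

$$7^3 = 343, \qquad \frac{343 - 150}{343} \approx 0.563$$

which matches the quoted reduction of about 56%. Under the same 7-point assumption a fourth component would raise the grid to $7^4 = 2401$ points, which illustrates the exponential growth noted in the introduction; how the required QMC sample count would change in that case is not stated in the text.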

We now present our results for total process variation considering both inter-die and intra-die components of variation. Table 1 compares the proposed approach with 20,000 Monte Carlo runs on benchmark circuits. The metrics compared are the mean $\mu$, the standard deviation $\sigma$, and the 95<sup>th</sup> percentile of the circuit leakage distribution. The errors in estimating these metrics for the largest benchmark circuit *Chip1* are less than 3%. The errors in estimating the metrics are less than 3.6% for all the benchmark circuits. Note that there is higher accuracy for the largest benchmark studied. The proposed approach has a runtime of less than 3 minutes for the largest benchmark, which illustrates the runtime efficiency. The larger runtime for *Chip1*, even accounting for the larger circuit size, is attributed to the fact that state probability information is considered only for this circuit. State probability consideration for each instance adds significant cost to the computation.

Figure 8 plots the accuracy against runtime of the proposed approach and a random sampling approach. We also compare this with the result



Figure 8. Comparison of accuracy of proposed approach with random sampling based approach vs runtime. The circuit considered is *Chip1* with 200,000 gates.

for another smart sampling based technique called Method1. As explained in Section 4, the proposed approach first generates inter-die samples using smart sampling. In the next step, a smart selection of samples in the total variation space is coupled with reuse of these samples to compute the mean of the local leakage distributions at the inter-die samples. In *Method1*, inter-die samples are generated using a Sobol sequence, as in the proposed approach. However, a random sampling based Monte Carlo analysis is performed at each inter-die sample to obtain the local distribution. In other words, there is no intelligence or reuse of samples in the total variation space. Because the inter-die samples are still generated using smart sampling, this method is expected to be faster than random sampling. Figure 8 shows that the proposed approach has a runtime of less than 3 minutes to achieve target accuracy for the largest benchmark, whereas Method1 has a runtime of 18.4 hours. This result illustrates the advantage of smart sampling and reuse of the additional samples in the total variation space. The random sampling approach has a runtime of 23 hours. Note that the curve for Method1 is steeper at the beginning than over the rest of its range. This is because in Method1 the number of inter-die samples is increased first, until the inter-die component of variation is captured accurately. After that, only the number of random samples used to capture the local distribution is increased while the number of inter-die samples is kept constant, hence the decrease in slope. The slow convergence of random sampling in capturing the local distribution is the reason for the comparable runtimes of Method1 and random sampling.
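The structure of Method1 described above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' implementation: there is a single inter-die variable, each gate leaks `exp(k * (inter_die + intra_die_i))`, and all parameter values are arbitrary. The inter-die points come from a base-2 van der Corput sequence, which coincides with the first dimension of a Sobol sequence, so no external QMC library is needed. The sketch does not implement the sample reuse of the proposed approach.

```python
import math
import random
from statistics import NormalDist, mean, pstdev
from typing import List


def van_der_corput(index: int, base: int = 2) -> float:
    """Return the index-th point (index >= 1) of the van der Corput sequence in (0, 1)."""
    point, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        point += digit / denom
    return point


def chip_leakage(inter_die: float, intra_die: List[float], k: float = 1.0) -> float:
    """Toy leakage model (assumption): sum of exp(k * (inter_die + intra_die_i)) over gates."""
    return sum(math.exp(k * (inter_die + local)) for local in intra_die)


def method1_samples(n_inter: int, n_local: int, n_gates: int,
                    sigma_inter: float, sigma_intra: float,
                    rng: random.Random) -> List[float]:
    """Low-discrepancy inter-die samples with plain random Monte Carlo at each of them."""
    std_normal = NormalDist()
    samples = []
    for i in range(1, n_inter + 1):
        # Map the low-discrepancy point in (0, 1) to a Gaussian inter-die value.
        g = sigma_inter * std_normal.inv_cdf(van_der_corput(i))
        for _ in range(n_local):
            # Random (not smart) sampling of the intra-die component for every gate.
            local = [rng.gauss(0.0, sigma_intra) for _ in range(n_gates)]
            samples.append(chip_leakage(g, local))
    return samples


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = method1_samples(n_inter=32, n_local=20, n_gates=50,
                          sigma_inter=0.3, sigma_intra=0.2, rng=rng)
print(len(samples), mean(samples), pstdev(samples))
```

The inner loop is where Method1 spends its time: every inter-die point triggers a fresh random Monte Carlo run, which matches the slow convergence noted in the text.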

Table 2 compares the proposed approach with an analytical approach to compute the leakage distribution based on [1]. In [1] the authors approximate the logarithm of gate leakage as a linear expression in the process variation variables. Wilkinson's approximation is used to compute the sum of lognormals and obtain circuit leakage as a lognormal expression. From Table 2, the maximum error in estimating  $\mu$  is 3.7% for the analytical approach compared to 0.5% for the proposed approach. Similarly, the maximum error in estimating  $\sigma$  is 6.1% for the analytical approach compared to 3.6% for the proposed approach. Note also that the proposed approach incurs less error as circuit size increases, but no such trend is observed for the analytical approach. For the largest benchmark, Chip1, state dependence has been implemented for both methods. The errors in estimating  $\mu$  and  $\sigma$  are significantly lower for the proposed approach in this case. As mentioned before, the runtime for Chip1 is significantly higher than for the other circuits, even accounting for circuit size, because state probability information of instances is considered in this circuit. For the analytical approach the increase in time cost is much higher because its dependence on the number of states is quadratic.
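To make the moment-matching idea behind Wilkinson's approximation concrete, here is a minimal Python sketch. It assumes independent lognormal terms, which is a simplification: it omits the correlation terms that a full-chip analysis such as [1] has to handle. The function name `wilkinson_sum` and the input values are illustrative only.

```python
import math
from typing import Sequence, Tuple


def wilkinson_sum(mus: Sequence[float], sigmas: Sequence[float]) -> Tuple[float, float]:
    """Return (mu, sigma) of the lognormal whose mean and variance match those of
    a sum of independent lognormals exp(N(mu_i, sigma_i^2))."""
    # First moment of the sum: sum of the individual lognormal means.
    mean_sum = sum(math.exp(m + 0.5 * s * s) for m, s in zip(mus, sigmas))
    # Variance of the sum: variances add because the terms are assumed independent.
    var_sum = sum((math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s)
                  for m, s in zip(mus, sigmas))
    # Solve the lognormal mean/variance formulas for the matching parameters.
    sigma_sq = math.log1p(var_sum / mean_sum ** 2)
    mu = math.log(mean_sum) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)


# Three hypothetical gates with different log-leakage means and spreads.
gate_mus = [0.0, 0.5, -0.2]
gate_sigmas = [0.3, 0.4, 0.25]
mu_total, sigma_total = wilkinson_sum(gate_mus, gate_sigmas)
print(mu_total, sigma_total)
```

With a single term the function returns that term's own parameters, which is a quick sanity check on the formulas. The result is only an approximation in general, since a sum of lognormals is not itself lognormal.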

Figure 9 compares the total leakage distribution of the largest benchmark circuit (200,000 gates) obtained with the proposed approach against the golden result and the analytical approach based on [1]. The leakage distribution considering only inter-die variation is also plotted. This analysis considers state probability information for instances in the circuit. The state probability information is extracted using a commercial tool. We see that

Table 2. Comparison of proposed approach with Wilkinson's based approach. \* indicates that state probability information is considered for instances in the circuit.

| Circuit | Gate Count | Proposed: % Error μ | Proposed: % Error σ | Proposed: Runtime (s) | Wilkinson's: % Error μ | Wilkinson's: % Error σ | Wilkinson's: Runtime (s) |
|---------|------------|------|------|------|------|------|------|
| VD1     | 5536   | 0.03 | 3.55 | 1.77 | 3.43 | 4.81 | 0.16 |
| VD2     | 13258  | 0.30 | 2.61 | 1.83 | 3.16 | 1.80 | 0.16 |
| USB     | 15946  | 0.01 | 1.97 | 1.95 | 3.62 | 6.13 | 0.19 |
| ETHER   | 23939  | 0.06 | 1.99 | 2.09 | 3.69 | 0.53 | 0.20 |
| VGA     | 43214  | 0.49 | 2.31 | 2.02 | 3.03 | 1.37 | 0.20 |
| Chip1\* | 198461 | 0.10 | 1.71 | 278  | 3.08 | 5.46 | 3094 |



Figure 9. Total leakage distribution considering intra-die variation for Chip1 (200,000 gates). Proposed approach is compared with the analytical approach based on [1] and the golden. The distribution due to inter-die variation is also plotted. The analysis considers state dependence of leakage for the instances in the circuit.

the distribution curve is captured with accuracy by the proposed approach whereas there is significant error with the analytical approach.

#### 6. Conclusions

Monte Carlo-based techniques are promising for statistical leakage analysis because of the generality and scalability of the approach, even when complex relations exist between leakage and process parameters. This work addresses the problem of reducing the sample size for Monte Carlo based leakage analysis. When considering only inter-die variation for a large benchmark circuit, the sample size needed to achieve target accuracy is reduced by 9.3X compared to a random sampling approach. The standard cell characterization cost is also reduced by 56%. We also propose a solution to estimate the total leakage distribution considering inter-die and intra-die components. A novel technique involving smart sampling combined with reuse of samples is introduced to address this issue. The proposed approach is compared with random sampling, with Method1 (where samples are not reused), and with an analytical approach. For the largest benchmark considered, the proposed approach performs the computation in 3 minutes, whereas the random sampling approach and Method1 complete the task in 23 hours and 18.4 hours, respectively. The analytical approach has up to 3.7% and 6.1% error in approximating  $\mu$  and  $\sigma$ , compared to 0.5% and 3.6% for the proposed approach. In addition, the characterization cost for the total leakage distribution is scalable with respect to the number of process variation variables: the Quasi Monte Carlo sample size increases moderately with the number of variables, whereas in a traditional grid-based characterization approach the cost grows exponentially with the number of process variation variables.

#### References

- [1] H.Chang, S.S.Sapatnekar, "Full-chip analysis of leakage power under process variations including spatial correlations", *Proc. Design Automation Conference*, pp. 523-528, 2005.
- [2] X.Li, J.Le, L.T.Pileggi, "Projection-based statistical analysis of full-chip leakage power with non-log-normal distributions", *Proc. Design Automation Conference*, pp. 103-108, 2006.
- [3] K.R.Heloue, N.Azizi, F.N.Najm, "Modeling and estimation of full-chip leakage current considering within-die correlation", *Proc. Design Automation Conference*, pp. 93-98, 2007.
- [4] T.Li, W.Zhang, Z.Yu, "Full-chip Leakage Analysis in Nano-scale Technologies: Mechanisms, Variation Sources, and Modeling", *Proc. Design Automation Conference*, pp. 594-599, 2008.
- [5] R.Y.Rubinstein, *Simulation and the Monte Carlo Method*, John Wiley & Sons Inc., 1981.
- [6] V.Veetil, D.Sylvester, D.Blaauw, "Efficient Monte Carlo based Incremental Statistical Timing Analysis", *Proc. Design Automation Conference*, pp. 676-681, 2008.
- [7] I.M.Sobol, "The Distribution of Points in a Cube and the Approximate Evaluation of Integrals", *USSR Comp. Math and Math. Phys.*, 7(4), pp. 86-112, 1967.
- [8] F.N.Najm, "A Survey of Power Estimation Techniques in VLSI Circuits", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 2, pp. 446-455, 1994.
- [9] A.Papoulis, *Probability, Random Variables and Stochastic Processes*, McGraw-Hill Inc., New York, 1991.
- [10] R.Rao, A.Devgan, D.Blaauw, D.Sylvester, "Parametric Yield Estimation Considering Leakage Variability", *Proc. Design Automation Conference*, pp. 442-447, 2004.