Design Fault Directed Test Generation for Microprocessor Validation

Deepak A. Mathaikutty, Sandeep K. Shukla
FERMAT Lab, Virginia Tech
Blacksburg, VA 24061
{mathaikutty,shukla}@vt.edu

Sreekumar V. Kodakara, David Lilja
The University of Minnesota
Minneapolis, MN 55455
{sreek,lilja}@ece.umn.edu

Ajit Dingankar
Validation Tools
Intel Corporation
Folsom, CA 95630
ajit.dingankar@intel.com

Abstract

Functional validation of modern microprocessors is an important and complex problem. One of the problems in functional validation is the generation of test cases that has higher potential to find faults in the design. We propose a model based test generation framework that generates tests for design fault classes inspired from software validation. There are two main contributions in this paper. Firstly, we propose a microprocessor modeling and test generation framework that generates test suites to satisfy Modified Condition Decision Coverage (MCDC), a structural coverage metric that detects most of the classified design faults as well as the remaining faults not covered by MCDC. Secondly, we show that there exists good correlation between types of design faults proposed by software validation and the errors/bugs reported in case studies on microprocessor validation. We demonstrate the framework by modeling and generating tests for the microarchitecture of VESPA, a 32-bit microprocessor. In the results section, we show that the tests generated using our framework’s coverage directed approach detects the fault classes with 100% coverage, when compared to model-random test generation.

1. Introduction

Functional validation has become a key ingredient in the development cycle of current and future microprocessors. Simulation is widely used to validate large systems like microprocessors. In simulation based validation, a test is executed in a golden reference model as well as in the design under test (DUT), which in this case is the RTL implementation of the microprocessor. The success of simulation based validation depends on the quality of tests. A potential fault in the RTL is exposed only if the test resulted in a state deviation of the RTL from that predicted by the golden model. Coverage metrics inspired from software validation (statement, branch and path coverage), state machine representation (state, transition and path coverage) and functionality of the microprocessor is used to measure the quality of tests. Effectiveness of coverage directed test generation depends on the types of faults that can occur and the strength of the coverage metric to detect these faults.

We analyze the bug descriptions and modeling errors reported during the study of microprocessor validation in [2, 13] and relate them to the fault types observed during software validation [7]. The outcome is a high correlation between the fault types and the modeling errors seen in the control logic of microprocessors. Hence, we propose a model based test generation framework that detects these fault classes.

The proposed framework allows modeling the microprocessor at the architectural and microarchitectural abstraction. The modeling language is developed as a metamodel, which provides the syntax and semantic necessary to describe the architectural and microarchitectural models. The framework translates these models into constraint satisfaction problems [4] that are resolved through a highly scalable constraint solver. The architectural model is converted into architectural constraints that are solved to generate random test suites that validates the RTL implementation. However, to improve the quality of the test suite generated, we enable our framework with coverage-directed test generation capability. The coverage metric is used to generate additional constraints that are given to the solver along with the constraints obtained from the model. The validation flow is shown in Figure 1.

![Figure 1. Validation Flow](image)

We support traditional coverages such as statement and branch as well as modified condition decision coverage and design fault coverage. Modified Condition Decision Coverage (MCDC) [6] is a coverage metric that is widely used for the validation of critical software systems like avionics. The test suite generated as result of MCDC-directed test generation is found to detect most of the design fault classes proposed in [7]. Therefore, we propose MCDC as a metric for microprocessor validation. The design fault classes not covered as result of MCDC is detected by generating additional constraints that tune the solver to generate solutions that test them. We illustrate our test generation framework by creating test suites for VESPA [8], a 32-bit pipelined microprocessor. We provide results to highlight the importance of MCDC-directed test generation for design fault coverage.
2. Related Work

Test generation for simulation based validation can be broadly classified into random and coverage directed test generation. Random test generators [3] are extensively used to validate microprocessors. Several techniques [5, 9, 12] were proposed in the literature for coverage directed test generation, which differ in the coverage metric being targeted and the techniques used to generate test cases.

Ur and Yadin [12] presented a method for generation of assembler test programs that probe the microarchitecture of a PowerPC processor. Benjamin et al. [5] describe a verification methodology that uses coverage of formal models to specify tests for a superscalar processor. In [9], Mishra et al. propose graph-based test generation for validation of pipelined processors that achieves functional coverage. Andrew Piziali [11] has presented a comprehensive study on functional verification coverage measurement and analysis. To the best of our knowledge, there are no previous approaches that propose a model based test generation for design fault coverage on microcode and RTL by applying MCDC and CSP formulation.

3. Background

We briefly outline the different fault classes and introduce the coverage metrics satisfied by our test generation.

<table>
<thead>
<tr>
<th>Fault Class</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Expression Negation Fault (ENF)</td>
<td>A subexpression of $S$ is negated</td>
</tr>
<tr>
<td>Term Omission Fault (TOF)</td>
<td>A term $t_i$ is omitted</td>
</tr>
<tr>
<td>Term Negation Fault (TNF)</td>
<td>A term $t_i$ is erroneously negated</td>
</tr>
<tr>
<td>Literal Omission Fault (LOF)</td>
<td>A literal $x_j$ is omitted from a term $t_i$</td>
</tr>
<tr>
<td>Literal Insertion Fault (LIF)</td>
<td>An extra literal $x_{k+1}$ is inserted into a term $t_i$</td>
</tr>
<tr>
<td>Literal Negation Fault (LNF)</td>
<td>A literal $x_j$ is erroneously negated in a term $t_i$</td>
</tr>
<tr>
<td>Literal Reference Fault (LRF)</td>
<td>A literal $y$ is referenced instead of $x_j$ in a term $t_i$</td>
</tr>
<tr>
<td>Disjunctive Operator Reference Fault (ORF)</td>
<td>A boolean operator (+) in $S$ is replaced by (+)</td>
</tr>
<tr>
<td>Conjunctive Operator Reference Fault (ORF)</td>
<td>A boolean operator (·) in $S$ is replaced by (·)</td>
</tr>
</tbody>
</table>

Fault Classes: Lau and Yu [7] proposed a comprehensive set of nine fault classes related to boolean expressions in disjunctive normal form (DNF). Let us consider a boolean expression in DNF form with $m$ terms $S = t_1 + \ldots + t_m$. Let $t_i$ denote the $i^{th}$ term in $S$ such that $t_i$ is a conjunction of $k_i$ literals. Let $x_j$ denote the $j^{th}$ literal in $t_i$, where $1 \leq j \leq k_i$. $S_{T,OP}$ is defined as the faulty implementation of $S$, where $F$ is the name of the fault class and $E$ identifies the exact fault in $S$. Definition 2.1 formally describes a fault type and Table 1 lists the nine different fault classes.

**Definition 3.1.** Term Omission Fault (TOF)

If a term $t_i$ is omitted from $S$ then $S_{T,OP} = t_1 + \ldots + t_{i-1} + t_{i+1} + \ldots + t_m$ where $1 \leq i \leq m$

Coverage Metrics: We define the three structural coverage criteria, namely statement, branch and MCDC, for which tests are generated in our framework.

To achieve statement coverage, a test suite should invoke every executable statement in the code. Branch coverage mandates that the test suite should execute both the true and false outcomes of all decisions (e.g., if statements) in the code. This metric is stronger than statement coverage, but is still weak due to the presence of conditions with in a decision. A condition is defined as a boolean expression with no boolean operators. A decision is defined as a boolean expression with zero or more boolean operators. For example, a decision $(A > 10 \text{ or } B)$ has two conditions $(A > 10)$ and $B$.

MCDC [6] was developed to test the effect of conditions within a decision. MCDC requires that each condition with in a decision should independently affect the outcome of the decision. This coverage ensures that the effect of each condition is tested relative to other conditions within the decision. This implies that no condition is left untested due to logical masking. For a decision with $m$ conditions, MCDC requires $m+1$ test cases [6]. It is difficult to attain when compared to statement and branch. For example, consider a decision $d$ with three conditions $(A < 0)$, $(B > 0)$ and $(C = 0)$, the test cases generated are shown below:

**Example:**

\[
d = (A > 0) \land B \lor (C = 0)
\]

Test cases generated for MCDC:

- **case A > 0:** $T_1 = [0, 1, 0]$ (d=0) and $T_2 = [1, 1, 0]$ (d=1)
- **case B:** $T_3 = [1, 0, 0]$ (d=0) and $T_4 = [1, 1, 0]$ (d=1)
- **case C = 0:** $T_5 = [0, 1, 0]$ (d=0) and $T_6 = [0, 0, 1]$ (d=1)

Test suite: $\{ T_1, T_2, T_3, T_4 \}$

4. Motivation

Velev [13] and Campenhout [2] conducted empirical studies on error/bug categories observed in academic microprocessor projects. The bug categories identified were very similar to the faults observed during validation of software systems [6]. The faults were analyzed and an extended fault model comprising of nine types of faults on boolean expressions was proposed by Lau et al. shown in Table 1. Being able to detect these basic fault classes would result in identifying more complex faults, which follows from the fault coupling effect hypothesis [7]. We correlate the bug categories in [2,13] to the extended fault model in [7] and motivate the need for a test generation framework that provides coverage for the fault classes.

In [2], Campenhout et al. analyzed modeling errors commonly seen in microprocessors and generated error distributions. We segregated the errors related to signals and gates in the control logic of the processor into three groups: (i) wrong, (ii) missing and (iii) extra. The study reported 33% of the errors occur due to wrong signal source and 4.9% due to wrong gate type. The missing gate errors accounted to 7.4% and missing input(s) to 11.5%. The errors from extra inversion contributed 1.6%. We relate these errors to the fault classes based on omission, insertion and incorrect reference (shown in Table 1) and summarize the correlation in Table 2. Note that the exact description of the modeling error were not provided and certain assumptions were made to arrive at the correlation.

**Table 2. Correlation Study**

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>ENF</td>
<td>54.8% (extra)</td>
<td>56.8% (159/280)</td>
</tr>
<tr>
<td>TOF</td>
<td>7.4% (missing)</td>
<td>11.4% (32/280)</td>
</tr>
<tr>
<td>LOF</td>
<td>54.8% (wrong)</td>
<td>26.9% (74/280)</td>
</tr>
<tr>
<td>LIF</td>
<td>9.4% (wrong)</td>
<td>0.4% (1/280)</td>
</tr>
</tbody>
</table>

Velev [13] studied three microprocessors to perform a bug col-
lection in pipelined and superscaler designs and arrived at the following bug categories: (i) incorrect control logic, (ii) incorrect connections, (iii) incorrect design for formal verification and (iv) incorrect symbolic simulation. His work identified 280 bugs and presented a neat description of each bug, which was subjected to our analysis. We found that 95% of the bugs identified were subsumed by the fault classification in [7] and the remaining 5% were artifacts of the formal verification framework in [13]. Our test generation is directed to detect all the design faults except LIF, since none of the errors observed was classified as a LIF.

5. Modeling & Test Generation

The modeling framework facilitates microprocessor modeling through an abstract representation called the metamodel [1]. The metamodel defines the syntax and semantic for the language in which the modeler describes the architecture and microarchitecture of the processor.

Architecture Modeling. The processor is specified through a register/memory-level description (registers, their whole-part relationship and memory schematic) and an instruction set capture (instruction-behavior). The metamodel for architectural modeling provides the user with constructs to describe function blocks, if-else blocks, statement sequences and loops as well as the register map and memory layout. On the left of Figure 2, we show the jump-if-not-zero instruction (JNZ) that positions the instruction pointer (IP) to the location specified in source (src) when the zero flag (ZF) is not set. The IP and ZF are references to registers modeled as a part of the architecture. The register map specifies all the registers as well as their whole-part relationship as shown on the right in Figure 2, whereas the memory schematic describes the memory hierarchy. The src is an identifier that translates to a register/memory reference or an immediate value during decoding.

Function block: JNZ
Identifier: src
Register reference: IP
Register reference: ZF
If block:
Guard:
Relational Statement: ZF = 0
Action:
Arithmetic Statement: IP = IP + src

Figure 2. Architectural Modeling

Microarchitecture Modeling. The language allows the modeler to instantiate and configure different pipe stages, provide ports to interface with the memory and register file, insert instruction registers and describe the control logic needed to perform data forwarding and stalling. It provides a set of basic configurable components such as MUXes, ALUs, INCs, etc that can be combined to create complex stages that interact with each other as well as with the architectural entities through access ports. In Figure 3, we show the update of the program counter in the instruction fetch stage (IF) of the VESPA processor [8] modeled using our constructs. The selection criteria of the MUXes are not shown, since they map to statements similar to the ones in architectural modeling. PC and PC2 are program counters, where PC belongs to IF and PC2 is an input to instruction decode (ID) stage. PC2 is updated with the PC or previous PC2 value depending on whether the pipeline is stalled or not. PC is updated with PC+4 (location of the next instruction), the previous PC/PC2 (pipeline stall) or a new address at port Z (branch instruction). The value at Z is obtained from the EX stage.

//PC Multiplexing
MUXx: PC, MUX
PC,MUX_op: Z, PC, PC+4, PC2
PC,MUX_op: PC
/Update PC2 after incrementing PC
INC, INC
INC, INC
INC, INC
INC
INC
INC
INC
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
MUX
M
criteria is valid in that run. The PFG for the CNJ instruction and the PC behavior of the fetch stage of VESPA is shown in Figure 5.

![Diagram of Program Flow Graph](image)

**Figure 5.** Program Flow Graph

For a graph with a unique path (no branches), the behavior is converted into CSP in a straightforward manner by generating constraint variables and assignment constraints. However for a model with multiple paths, the graph undergoes Static Single Assignment (SSA) analysis [10] to construct the correct set model with multiple paths, the graph generated undergoes Static constraint variables and assignment constraints. However for a graph with multiple paths, the graph generated undergoes Static constraint variables and assignment constraints. The PFG for the JNZ instruction and the current value of the PC behavior of the fetch stage of VESPA is shown in Figure 5.

![Diagram of SSA analysis](image)

**Figure 6.** SSA analysis outcome of the PFG in Figure 5.b

Consider the decision $D_3$ and state $S_3$, which does not change the current value of $pc$ but forwards the previous value ($pc_1 = pc_0$). Omission of $S_3$ results in a contention at $S_3$, and multiple paths, where most of them update the $pc$ register except for one ($S_3$). Therefore, a parent resolution is performed that inserts additional decisions and variables as shown on the right in Figure 6. The SSA is non-trivial when considering nested-ifs and loop unrolling.

The CSP formulation guidelines are outlined below.

**Algorithm 1: CSP Formulation**

- **Step 1** If a state assigns a new value to a variable, a new constraint variable is generated for that variable (renaming act in SSA) and future references to the variable refer to the newly generated constraint variable.
- **Step 2** If a state assigns multiple new values for a program variable, a new constraint variable is generated for each decision that results in a value, the corresponding conditional constraint is generated such that the new variable can be assigned that value (parent resolution in SSA).

The language used to print out the CSP consist of assignment and conditional constraints. The CSPs developed for the above examples are shown in Figure 7. On the left, the architectural constraints for the CSP generated from the CNJ instruction are shown and on the right the microarchitectural constraints for the CSP generated from the PC behavior in the IF stage are shown.

**Test case Generator (TCG).** The CSP is given as input to the constraint solver in the TCG. The constraint solver used in our framework is ILOG, which is a commercial solver designed for performance and scalability. The solution generated by the solver consists of a possible value for every constraint variable in the CSP. Examples of some possible solutions generated for Figure 7 are shown in Figure 8. Test case $T_1$ causes the IF stage to fetch a branch instruction from #1100 and $T_4$ causes both IF and ID stages of the pipeline to be stalled.

**Figure 7.** CSP formulation for Figure 5.a & 6 (right side)

The binary generator creates a program binary by combining the model with the registers and memory locations initialized to the values stored in the input constraint variables obtained from the test case. A program binary for the PC's behavior initializes $Z$, PC and $PC2$ to the values of $zo$, $pc_0$, and $pc_2$ respectively. The output constraints $pc_3$ and $pc_2$ are the results obtained by executing the test case against the reference model.

**Usage Mode.** There are two modes of usage for the test generation framework: (i) Model-random and (ii) Coverage-directed. A test suite in the model-random mode is generated by converting the model into a CSP and solving it multiple times through the solver. Attaining a coverage goal in this mode requires a large number of test cases. However, validators are interested in design-fault based testing, the test cases for which are difficult to obtain using the model-random approach. As a result, the framework also provides a coverage-directed approach that generates one possible test suite.
which attains 100% coverage of a specific goal. In this mode, coverage constraint are given as input to the solver that directs the test generation towards the goal as shown in Figure 4.

6. Coverage Constraints

The coverage constraint generator (CCG) takes two set of inputs besides the program flow graph. Firstly, the coverage annotations that the modeler provides to expedite the reachability of the coverage goal. Secondly, the coverage type based on which the additional constraints are generated. The coverage hints are embedded into the flow graph for certain coverage types and for the others specialized constructs that serves as collectors are provided. The annotations are done by setting the attribute MARK associated with a state and decision. The coverage types supported are: (i) Statement Coverage, (ii) Branch Coverage, (iii) MCDC and (iv) Design Fault Coverage. The test suite created for statement- and branch-directed test generation is a subset of the test suite created for MCDC. Therefore, we explain how the coverage constraints for MCDC are generated.

Algorithm 2: MCDC Constraint Generation

\[
\begin{align*}
\text{collect}_{\text{addr}}(G) &= \text{collect}_{\text{addr}}(G) \cup \text{collect}_{\text{addr}}(G) \\
\text{collect}_{\text{addr}}(G) &= \text{collect}_{\text{addr}}(G) \cup \text{collect}_{\text{addr}}(G) \\
\end{align*}
\]

MCDC: The objective of MCDC is that every condition within a decision should independently affect the outcome of the decision. The CCG works off the flow graph and generates the constraints necessary to attain the objective using Algorithm 2. The validator is allowed to tag the problematic decision points and the CCG collects them using \text{collect}_{\text{addr}}. It then orders these collected nodes using \text{order}_{\text{addr}} based on their depth in the graph before generating the constraints. For every collected decision, the CCG identifies a path that reaches it using \text{find_path} and generates the constraints for the path as well as the additional MCDC constraints for that decision. The constraints for the path is generated using the \text{gen_constraint} function. For a decision that is a condition, branch and MCDC generates identical test suites. However, for a decision with one or more boolean operators found using \text{check boolean_op}, the additional constraints are generated to tune the evaluation of the decision to the condition’s effect. For each condition, the decision should evaluate to true and false based on whether it is assigned a ‘0’ or a ‘1’. This results in \(m\) test cases for the 0-value scenario and another \(m\) for the 1-value scenario. After removal of the duplicate test cases using \text{rm duplicate} MCDC results in \(m + 1\) test cases. To perform MCDC on a decision \(d\) with \(m\) conditions, w.r.t the \(j^{th}\) condition, CCG generates constraints for both the 0 and 1-value scenarios using Function 1.

\[
\begin{align*}
\text{Function 1: MCDC\_cov}(c_{j}, d, \text{val}) &= \text{copy}(d, 0, i-1) \\
&\cup \text{copy}(d, i, 0) \\
&\cup \text{copy}(d, i+1, m) \\
\end{align*}
\]

\(\text{Function 2}\): MCDC\_cov copies \(d\) into \(d'\) and renames the conditions in \(d'\). Then, it generates constraints that assert \(d'\) to \(val\) and \(d'\) to \(\neg\text{val}\) by the \text{cr_constraint} function. It also enforces that every condition in \(d\) and \(d'\) have identical values except for the condition \(c_{j}\) in \(d\) and its corresponding copy in \(d'\) using the same function.

Design Fault Coverage: The design fault classes are provided in Table 1. To attain this coverage goal, test cases that cover each of these fault class are generated by the CCG. Our probabilistic analysis of the fault detection capability of MCDC revealed that seven of the nine fault classes are covered by the tool suite generated as result of MCDC-directed test generation.

Consider the literal omission fault class (LOF), MCDC detects this fault because to satisfy this coverage, every literal in the decision has to affect the output of the decision. Therefore a pair of test cases that causes the missing literal to affect the decision’s output in the reference model exist, which is missing in the implementation. For at least one of these test cases, the output of the decision in the implementation will differ from the fault-free decision. Therefore, MCDC-directed test generation finds the bug with a probability 1. The only fault class not covered by MCDC with a probability 1 is literal reference fault, for which CCG produces additional constraints that guide the TCG to detect it.

Example:

\(d = A \land B \lor C\) (fault-free decision), \(d' = A \land B\) (LOF decision)

Test cases generated:

\[
\begin{align*}
&\text{case } A : [0, 1, 0] (d=0) \text{ and } [1, 1, 0] (d=1) \\
&\text{case } B : [1, 0, 0] (d=0) \text{ and } [1, 1, 0] (d=1) \\
&\text{case } C : [0, 0, 1] (d=0) \text{ and } [0, 1, 1] (d=1) \\
&\text{test case } [0, 1, 1] \text{ executed on } d' \text{ results in } d' < d'
\end{align*}
\]

Literal Reference Fault Coverage: The annotations to attain this coverage are captured using the LRF\_list construct, which allows to create a list of variables. The validator associates a LRF\_list with every problematic variable \(v\) that tells CCG that \(v\) can be replaced by an instance of any variable in the given LRF\_list, which results in a fault. If a LRF\_list is not provided then, the CCG assumes that every variable can be replaced incorrectly by every other variable, which results in a large set of test cases. Therefore, the validator is recommended to specify a possible LRF\_list with a variable and mark that variable, such that CCG can take advantage it. The constraints generation for LRF coverage is shown in Algorithm 3.

Algorithm 3: LRF Constraint Generation

\[
\begin{align*}
\text{Algorithm 3: LRF Constraint Generation} \\
&\text{\text{Function 1: MCDC\_cov}(c_{j}, d, \text{val})} \\
&\text{\text{Function 2} MCDC\_cov copies \(d\) into \(d'\) and renames the conditions in \(d'\). Then, it generates constraints that assert \(d'\) to \(val\) and \(d'\) to \(\neg\text{val}\) by the \text{cr_constraint} function. It also enforces that every condition in \(d\) and \(d'\) have identical values except for the condition \(c_{j}\) in \(d\) and its corresponding copy in \(d'\) using the same function.}
\end{align*}
\]

After specifying the coverage type and inserting the coverages
hists, the CCG generates the coverage constraints. These are given as additional constraints to the TCG to generate the test suite that achieves the coverage goal.

7. Results

We applied our test generation methodology on VESPA [8], a 32-bit pipelined RISC microprocessor to evaluate our framework. The microarchitecture of VESPA was modeled using our metamodel-based language and converted into a program flow graph from which a CSP was formulated. Additional constraints for each coverage metric were generated using the coverage constraint generator. These coverage constraints along with the CSP are resolved through a solver to generate the coverage-specific test suite. Model-random test suites were generated by asserting the statements and decisions in the model with uniform probability.

The model of the microprocessor consisted of 330 unique decisions and 400 unique conditions, where the number of conditions in a decision varied between 1 and 10. These decisions translate to boolean expressions in the HDL implementation of VESPA and are prone to modeling errors. Being able to generate tests that detect these errors improves the quality of functional validation.

To evaluate our framework, we inserted modeling errors into the model from the fault classes described in [7]. Seven modeling bugs \((LOF, LNF, TOF, TNF, ENF, ORF\[+]\) and \(ORF[-]\)) were inserted into each of the 330 decisions in the model. For \(LRF\), 50 decisions were randomly chosen and the variables in the decisions were randomly replaced with other variables in the model. For each inserted bug, we compute the test-set that would detect the bug. Next, we compared the test cases in the test suite generated for model-random, statement, branch and MCDC with the set of test cases that detect each inserted bug (the intersection of the two test suite). The quality of different coverage metrics is measured by the number of distinct bugs detected by a test suite generated to satisfy a coverage metric. In order to do a fair comparison, the number of test cases in the test suites being compared were adjusted to be the same. Table 3 shows the percentage of bugs detected by each of the test suites.

<table>
<thead>
<tr>
<th>S.No</th>
<th>Fault</th>
<th>Random (%)</th>
<th>Stmt (%)</th>
<th>Branch (%)</th>
<th>MCDC (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>LOF</td>
<td>92.7</td>
<td>28.9</td>
<td>32.7</td>
<td>100</td>
</tr>
<tr>
<td>2</td>
<td>LNF</td>
<td>99.9</td>
<td>79.9</td>
<td>80.2</td>
<td>100</td>
</tr>
<tr>
<td>3</td>
<td>TOF</td>
<td>92</td>
<td>32</td>
<td>36</td>
<td>100</td>
</tr>
<tr>
<td>4</td>
<td>TNF</td>
<td>100</td>
<td>76.4</td>
<td>77.6</td>
<td>100</td>
</tr>
<tr>
<td>5</td>
<td>ENF</td>
<td>100</td>
<td>100</td>
<td>100</td>
<td>100</td>
</tr>
<tr>
<td>6</td>
<td>ORF[+]</td>
<td>96</td>
<td>32</td>
<td>36</td>
<td>100</td>
</tr>
<tr>
<td>7</td>
<td>ORF[-]</td>
<td>98</td>
<td>66.6</td>
<td>61.7</td>
<td>100</td>
</tr>
<tr>
<td>8</td>
<td>LRF</td>
<td>92.5</td>
<td>74.5</td>
<td>95</td>
<td>95</td>
</tr>
</tbody>
</table>

It is seen that MCDC is uniformly better in detecting each of the design faults when compared to statement and branch. MCDC detects 7 of the 8 fault classes 100% of the time. Statement and branch are able to cover some of the fault classes completely and for some they perform badly. Therefore, these metrics are not stable and weaker than MCDC. This illustration supports our proposal to use MCDC as a coverage goal for microprocessor validation. Model-random on the other hand performs better than statement and branch because random tests generated in our framework can assert multiple decisions and statements simultaneously resulting in a higher fault coverage. However, model-random test generation fails to achieve 100% coverage of design faults such as \(LOF\) and \(TOF\), which are commonly seen bugs in microprocessor validation as shown in Table 2.

8. Conclusion And Future Work

Simulation-based validation is widely acknowledged as a major bottleneck in current and future microprocessor design due to the increasing complexity. A possible solution is to perform design error-based test generation from a high level description of the microprocessor. Our methodology made two important contributions. Firstly, a metamodel-based modeling framework was developed that captures the structure and behavior of the architecture (register file and ISA) and the microarchitecture (pipeline structure, control and data forwarding logic). CSP formulation of these models resulted in test suites that covered design fault commonly seen during microprocessor validation therefore, illustrating our coverage directed test generation approach. Secondly, we show a high correlation between the design faults proposed by software validation and modeling errors/bugs observed during the study microprocessors validation.

Our future work includes extending the test generation framework to perform exception coverage, event coverage and sequencing coverage for instructions. We will also perform empirical evaluation of the bug finding capability of different coverage metrics as well as the stability of different coverage metrics across test suites.

9. References