CPU Verification Challenges

Moore’s Law is slowing, and modern CPU designs have become more susceptible to superbugs as a result. As seen in Fig. 1, the number of transistors is still growing at a rate of 50% per year, but transistor switching speed has leveled off and CPU clock frequency has reached a practical maximum. Single-thread performance is now improving at only 10% per year. These gains no longer come from faster clock rates, but from advanced architectural features that improve performance at the cost of increased complexity.

Figure 1: Microprocessor trend data

Below are some examples of performance-enhancing features that have led to increased vulnerability to superbugs:

Deep Pipelining
As the instruction pipeline is deepened, each stage can be made simpler, allowing the processor to be clocked at a higher rate. However, the larger number of interdependent stages increases the overall complexity.

Data Forwarding
When data from the result of an earlier instruction is needed in a subsequent instruction, the result is fed back to the input of the execution unit that needs it, before it is written back to the register file. This avoids pipeline stalls and boosts performance at the expense of increased complexity.
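The forwarding decision described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular CPU: the two bypass sources (named ex_mem and mem_wb after the classic five-stage textbook pipeline), the InFlight record, and the read_operand helper are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class InFlight:
    """A result that has been computed but not yet written to the register file."""
    dest_reg: int
    result: int


def read_operand(src_reg: int,
                 regfile: Dict[int, int],
                 ex_mem: Optional[InFlight],
                 mem_wb: Optional[InFlight]) -> int:
    """Return the newest value of src_reg, preferring in-flight results.

    Assumed stage names: ex_mem holds the younger producer, mem_wb the older one.
    """
    # Check the youngest producer first so a stale value is never forwarded.
    if ex_mem is not None and ex_mem.dest_reg == src_reg:
        return ex_mem.result
    if mem_wb is not None and mem_wb.dest_reg == src_reg:
        return mem_wb.result
    # No in-flight producer: the register file already holds the right value.
    return regfile[src_reg]


regfile = {1: 10, 2: 20}
older = InFlight(dest_reg=1, result=111)   # sits in the later pipeline stage
newer = InFlight(dest_reg=1, result=222)   # sits in the earlier pipeline stage
```

Even in this toy form, the priority between the two bypass paths matters: swapping the two checks would forward a stale value whenever both in-flight instructions write the same register, which is the kind of corner case that is easy to miss in simulation.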

Out-of-order Execution
Out-of-order (OOO) processors reduce the pipeline stalls that occur when an instruction’s data takes a long time to retrieve, such as during a cache miss. They do so by allowing subsequent instructions whose operands are already available to execute first. Complex circuitry is needed to reorder the instructions for execution and then commit the results in program order.
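The requirement to commit results in program order, even when execution finishes out of order, can be illustrated with a toy reorder buffer. The ReorderBuffer class and its dispatch, complete, and commit methods are hypothetical names chosen for this sketch; real designs also track register renaming, exceptions, and speculation state.

```python
from collections import deque


class ReorderBuffer:
    """Toy reorder buffer: entries are kept in program order, oldest first."""

    def __init__(self):
        self.entries = deque()

    def dispatch(self, name):
        """Allocate an entry in program order and return a handle to it."""
        entry = {"name": name, "done": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry):
        """Mark an entry as executed; this may happen in any order."""
        entry["done"] = True

    def commit(self):
        """Retire finished entries from the head only, preserving program order."""
        committed = []
        while self.entries and self.entries[0]["done"]:
            committed.append(self.entries.popleft()["name"])
        return committed


rob = ReorderBuffer()
load = rob.dispatch("load (cache miss)")
add = rob.dispatch("add")
sub = rob.dispatch("sub")

rob.complete(add)          # younger instructions finish first
rob.complete(sub)
first = rob.commit()       # nothing retires: the oldest entry is still pending

rob.complete(load)         # the cache miss finally returns
second = rob.commit()      # all three retire, in program order
```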

Branch Prediction and Speculative Execution
Branch predictor logic improves the instruction pipeline flow by guessing whether a conditional jump is likely to be taken or not. If the condition takes many cycles to evaluate, this avoids a lengthy pipeline stall; however, it comes at the expense of the additional circuit complexity required to store and rewind the instructions when the guess is wrong.
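As a concrete example of such guessing, the sketch below models a two-bit saturating counter, a common textbook prediction scheme. It is used here purely for illustration; the text above does not specify a prediction algorithm, and the class and function names are assumptions of this example.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counter = 1  # assumed starting state: weakly not taken

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        """Move one step toward the actual outcome, saturating at 0 and 3."""
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_mispredictions(outcomes):
    """Replay a sequence of actual branch outcomes and count wrong guesses."""
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1  # in hardware, this is where speculative work is rewound
        predictor.update(taken)
    return misses


# A loop branch taken nine times, then falling through once on exit.
loop_branch = [True] * 9 + [False]
loop_misses = count_mispredictions(loop_branch)
```

With the assumed starting state, the loop branch is mispredicted on its first iteration and again on loop exit. Each misprediction corresponds to a rewind of speculative work, and the rewind logic is where the added complexity lies.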

Multiple Issue and Superscalar
Multiple-issue processors execute two or more instructions in parallel. This requires duplicated execution hardware, plus more complex logic to check for data dependencies between the instructions being issued together.
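The dependency check between two instructions can be sketched as follows. This is a simplified model under stated assumptions: the machine issues at most two instructions in order, each instruction writes one register, and operands are read at issue time, so write-after-read ordering is not treated as a hazard. The Instr type and can_dual_issue function are hypothetical names.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    dest: int                 # destination register number
    srcs: Tuple[int, ...]     # source register numbers


def can_dual_issue(first: Instr, second: Instr) -> bool:
    """Return True if `second` may issue in the same cycle as `first`."""
    # Read-after-write: the second instruction needs the first one's result.
    raw = first.dest in second.srcs
    # Write-after-write: both instructions target the same register.
    waw = first.dest == second.dest
    return not (raw or waw)


add_r3 = Instr(dest=3, srcs=(1, 2))
uses_r3 = Instr(dest=4, srcs=(3, 5))
independent = Instr(dest=4, srcs=(5, 6))
rewrites_r3 = Instr(dest=3, srcs=(5, 6))
```

The number of pairwise comparisons grows roughly with the square of the issue width, which is one reason wider machines need disproportionately more checking logic.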

Instruction Fusion
Micro-operations from the same or different instructions can be merged into a single micro-operation to improve pipeline throughput. This feat can only be accomplished by using smarter, more complex decode circuitry that recognizes the micro-operations that should be fused.
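A decoder that recognizes fusible micro-operations can be sketched as a pattern match over adjacent pairs. The single fusion rule used here (a compare immediately followed by a conditional branch) and the micro-operation names are assumptions chosen for illustration; actual fusion rules are specific to each microarchitecture.

```python
def fuse(uops):
    """Merge each adjacent ("cmp", "branch") pair into one fused micro-operation."""
    fused = []
    i = 0
    while i < len(uops):
        is_pair = (i + 1 < len(uops)
                   and uops[i] == "cmp"
                   and uops[i + 1] == "branch")
        if is_pair:
            fused.append("cmp+branch")  # one pipeline slot instead of two
            i += 2
        else:
            fused.append(uops[i])
            i += 1
    return fused


stream = ["add", "cmp", "branch", "load"]
fused_stream = fuse(stream)
```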

CPU Blocks with Simulation-Resistant Superbugs

Many functional units in a CPU are commonly found to contain simulation-resistant superbugs, such as:

Instruction fetch queue

OOO functions: register rename, reorder buffer and retirement

Instruction scheduler

Execution units

Load-Store unit

L2 cache

Coherency manager

These types of blocks have too many combinations of events and states for simulation to test them all, or to completely cover every temporal relation between events. Take the example of a load-store unit. It must handle a wide range of possible operations, including loads, stores, fills, snoops and flushes. To rid the design of superbugs, verification must cover the cross of all possible temporal orderings of events with all possible states of the design. This includes special conditions such as store-to-load forwarding and the clearing of speculative data produced by stores or loads on a mispredicted branch path.
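A rough back-of-the-envelope calculation shows why simulation cannot keep up. The sketch below counts only the ordered sequences of the five operation types named above, ignoring addresses, data values, timing gaps, and the internal state of the design, so it is a simplified lower bound rather than a model of any real load-store unit.

```python
import itertools

OPERATIONS = ("load", "store", "fill", "snoop", "flush")


def ordered_sequences(length, kinds=len(OPERATIONS)):
    """Number of distinct ordered sequences of `length` back-to-back operations."""
    return kinds ** length


# Cross-check the formula by brute-force enumeration for a short sequence.
enumerated = len(list(itertools.product(OPERATIONS, repeat=3)))

ten_ops = ordered_sequences(10)      # 9765625 orderings for just ten operations
twenty_ops = ordered_sequences(20)   # grows by a factor of five per extra operation
```

Crossing each of these orderings with every reachable design state multiplies the count further, which is why exhaustive coverage through directed or random simulation is out of reach for such blocks.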

Oski Formal for CPUs

Oski’s Formal Sign-Off Methodology enables exhaustive analysis of all possible design states. Oski has developed the expertise required to anticipate where the CPU superbugs are most likely to occur and to know how to flush them out. Contact Oski to learn more.