Networking Verification Challenges

Networking chips, such as network processors (NPUs) and switches, are large-scale, complex devices that provide high-bandwidth, low-latency traffic routing. Networking equipment must typically meet high reliability and quality requirements, which makes eradicating latent superbugs from these devices very important.

The rising demand for cloud computing applications and network virtualization mechanisms has caused a surge of growth in Software Defined Networking (SDN) protocols. The silicon devices in datacenter switches must be engineered with the flexibility and configurability to support a wide range of networking protocols. These configurable designs have far too many combinations of settings to practically test with traditional verification methods.
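
A short calculation gives a rough sense of the scale. The sketch below multiplies the number of legal values of each independent setting. The knob list (32 ports with 4 speed modes each, plus three global settings) is a made-up assumption for illustration and does not describe any real device.

```python
def config_space_size(values_per_knob):
    """Number of distinct configurations when every knob is set independently."""
    total = 1
    for num_values in values_per_knob:
        total *= num_values
    return total

# Hypothetical device: 32 ports, each independently in one of 4 speed modes,
# plus three global knobs with 2, 3 and 8 legal values (all assumed).
PER_PORT_MODES = [4] * 32
GLOBAL_KNOBS = [2, 3, 8]

total_configs = config_space_size(PER_PORT_MODES + GLOBAL_KNOBS)
print(f"{total_configs:.3e} configurations")
```

Each additional independent knob multiplies the total, so the count grows exponentially with the number of settings rather than linearly.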

The typical architecture of a networking switch (Fig. 1) includes high-speed line interfaces (e.g. Ethernet), ingress and egress packet processing pipelines, and a central packet buffering and scheduling unit that provides the main control of data flow through the device. Each of these main components presents its own verification challenges.

Figure 1: Architecture of a networking device

Port interfaces
Port interfaces must conform exactly to the standard protocols to ensure that networking equipment is interoperable between vendors. This is especially challenging for new and complex protocols such as 400G Ethernet.

Forwarding pipelines
Complexity in the forwarding pipelines arises because the designs must handle a wide range of packet sizes, from as few as 64 bytes up to thousands of bytes, arriving with arbitrary alignment. Multiple types of packet headers must be parsed (e.g. IPv4, IPv6) and complex look-ups performed to determine the routing information. In addition, large counters are used to implement metering and to check timeout conditions. These add to verification complexity because special conditions such as counter roll-over states, where superbugs can hide, are rarely reached.
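
The rarity of roll-over can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a free-running counter that increments once per clock cycle and a simulator throughput of 1,000 cycles per second. Both figures, and the function names, are illustrative assumptions.

```python
SECONDS_PER_DAY = 86_400

def increment(value, width_bits):
    """Advance a wrapping counter of the given width by one."""
    return (value + 1) & ((1 << width_bits) - 1)

def cycles_until_rollover(width_bits, start=0):
    """Increments needed before the counter wraps back to zero."""
    return (1 << width_bits) - start

def days_to_rollover(width_bits, sim_cycles_per_second):
    """Wall-clock days a simulation needs to reach one roll-over from reset."""
    return cycles_until_rollover(width_bits) / sim_cycles_per_second / SECONDS_PER_DAY

# Assumed simulator speed: 1,000 design cycles per second.
for width in (16, 32, 48):
    print(width, round(days_to_rollover(width, 1_000), 2), "days")
```

Under these assumptions a 32-bit counter needs roughly 50 days of continuous simulation to wrap once, and every extra bit of width doubles that time.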

Packet queuing and scheduling
Networking devices use very wide data paths, hundreds of bits across, that must be enqueued and dequeued every cycle. This adds to memory system complexity, since a memory with such high bandwidth has to be built from multiple distributed memory banks.

Scheduling of packets involves complex arbitration and metering to ensure correct bandwidth allocation for the ports of the device. To maximize bandwidth utilization and minimize switch latency, multiple packets may be packed together into the internal data path words, and the scheduling logic must ensure that there are no gaps in the flow of data whenever data is available to be forwarded. Both requirements add to complexity.
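
The benefit of packing can be seen by counting data path words. The sketch below is a simplified model that assumes a 64-byte internal word and ignores any per-packet metadata. The word size, the packet sizes and the function names are all assumptions chosen for illustration.

```python
def words_unpacked(packet_sizes, word_bytes):
    """Words used when every packet starts on a fresh data path word."""
    return sum((size + word_bytes - 1) // word_bytes for size in packet_sizes)

def words_packed(packet_sizes, word_bytes):
    """Words used when packets are packed back to back with no padding."""
    total_bytes = sum(packet_sizes)
    return (total_bytes + word_bytes - 1) // word_bytes

def utilization(packet_sizes, word_bytes, words_used):
    """Fraction of the transferred bytes that carry packet data."""
    return sum(packet_sizes) / (words_used * word_bytes)

WORD_BYTES = 64          # assumed internal word width (512 bits)
packets = [65] * 4       # four packets, each one byte longer than a word

unpacked = words_unpacked(packets, WORD_BYTES)
packed = words_packed(packets, WORD_BYTES)
print(unpacked, utilization(packets, WORD_BYTES, unpacked))
print(packed, utilization(packets, WORD_BYTES, packed))
```

In this model a packet that is one byte longer than the word wastes almost a full word unless the next packet shares it, which is why packing logic has to track packet boundaries inside a word.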

Schedulers must also support multicasting, in which a single incoming packet is replicated to several output ports, up to and including all of them. Subscriber bandwidth allocation can also vary over a wide range, which adds to complexity.
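
As a small illustration of multicast replication, the sketch below represents the set of destination ports as a bit mask, one bit per output port. The mask encoding and the function names are assumptions made for this example, not a description of any particular scheduler.

```python
def replicate(packet, port_mask, num_ports):
    """Return one (port, packet) pair for every port whose bit is set."""
    return [(port, packet)
            for port in range(num_ports)
            if (port_mask >> port) & 1]

def destination_sets(num_ports):
    """Number of distinct non-empty sets of output ports."""
    return (1 << num_ports) - 1

copies = replicate("pkt", 0b1010, num_ports=4)   # ports 1 and 3
print(copies)
print(destination_sets(64))
```

The number of possible destination sets doubles with every added port, so the multicast cases alone cannot be enumerated one by one on a device with many ports.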

Networking Blocks with Simulation-Resistant Superbugs

Many functional units in a networking device are commonly found to contain simulation-resistant superbugs, including:

Forwarding engines

Linked-list controllers

Buffer managers

Crossbar switches

Packet parsers

Quality of Service (QoS) units

Framers

These types of blocks have too many combinations to test in simulation, and simulation cannot completely cover all temporal relations between events.
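
To illustrate how a bug can depend on a specific ordering of events, the sketch below exhaustively explores a toy buffer-manager model with a breadth-first search over every reachable state and input combination. The model, its depth of 4 and the planted bug are hypothetical. This is a generic explicit-state search written for this example and is not a description of any vendor's tool or methodology.

```python
from collections import deque

DEPTH = 4  # assumed buffer depth for the toy model

def step(count, enq, deq):
    """Next occupancy of a hypothetical buffer manager with a planted bug."""
    if enq and deq:
        # Planted bug: a simultaneous enqueue and dequeue on a full buffer
        # is handled as a plain enqueue, so the occupancy overflows.
        return count + 1 if count == DEPTH else count
    if enq and count < DEPTH:
        return count + 1
    if deq and count > 0:
        return count - 1
    return count

def find_violation(initial=0):
    """Search all reachable states for a break of 0 <= count <= DEPTH.

    Returns the shortest list of (enq, deq) inputs that reaches a
    violation, or None if no reachable state violates the invariant.
    """
    seen = {initial}
    frontier = deque([(initial, [])])
    while frontier:
        count, trace = frontier.popleft()
        for enq in (False, True):
            for deq in (False, True):
                nxt = step(count, enq, deq)
                new_trace = trace + [(enq, deq)]
                if not 0 <= nxt <= DEPTH:
                    return new_trace
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, new_trace))
    return None

print(find_violation())
```

The planted bug only fires when the buffer is exactly full and both inputs are asserted in the same cycle. Random stimulus may or may not produce that coincidence, whereas the search visits it by construction. Real blocks have far larger state spaces than this toy model.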

Oski Formal for Networking

Oski’s Formal Sign-Off Methodology enables exhaustive analysis of all possible design states. Oski has developed the expertise required to anticipate where the networking silicon superbugs are most likely to occur and to know how to flush them out. Contact Oski to learn more.