
Understanding “Output Errors” and “Stomped CRC” on Cisco ACI Leaf–Spine Links

  • Writer: Mukesh Chanderia
  • Jan 4
  • 4 min read

Introduction

One of the most misunderstood interface statistics in Cisco ACI and Nexus-based fabrics is the presence of “output errors” on leaf–spine links, often accompanied by “stomped CRC” or input errors on the peer interface.

At first glance, this can appear alarming—especially when:

  • Host-facing ports show zero CRC or input errors

  • Fabric links are stable for years

  • Traffic volumes are extremely high

  • Errors occur only occasionally, not continuously


Common Scenario Observed in the Field


Typical observations engineers encounter:

  • Leaf–spine interfaces show:

    • output error counters incrementing slowly over time

  • The corresponding peer interface shows:

    • stomped CRC and input error counters incrementing

  • Host-facing ports:

    • Clean counters (0 CRC, 0 input errors)

  • No link flaps, no reliability degradation

  • Counters have never been cleared and span several years

This often triggers questions such as:

  • “If hosts are clean, where did these errors come from?”

  • “Why does the switch corrupt frames instead of simply dropping them?”

  • “Is this a hardware or cabling issue?”

Let’s answer these step by step.


Internal Packet Flow in a Cisco ACI Leaf

Before interpreting counters, it’s critical to understand where drops can occur.




Simplified data path inside a leaf:

Host Port
   ↓
Ingress ASIC (frame accepted cleanly)
   ↓
Internal fabric / queues / buffers
   ↓
Egress ASIC (leaf–spine port)
   ↓
Spine

Key takeaway:

Errors seen on leaf–spine ports may originate after the packet has already been accepted from the host.


Why Do Host-Facing Ports Stay Clean?

Host-facing interfaces showing:

  • 0 CRC errors

  • 0 input errors

mean that:

  • Frames arrived correctly from servers

  • No physical or L2 integrity issues on host links

  • No drops occurred at ingress

If a packet is later dropped inside the leaf’s egress pipeline, it will:

  • Never return to the host

  • Never increment host-facing counters

  • Be invisible to server NIC statistics

This is expected behavior.


What Does “Stomped CRC” Actually Mean?

Normal CRC Error

  • Frame arrives corrupted on the wire

  • Usually caused by cable, optics, or physical issues

Stomped CRC (Very Different)

  • The ASIC intentionally corrupts the frame’s CRC

  • Done before transmission

  • The frame is sent with a bad FCS so the next hop discards it

This is not random corruption.

It is a controlled hardware mechanism used in specific scenarios.
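The distinction can be sketched in a few lines of Python. This is a simplified illustration, not a model of any specific ASIC: it uses `zlib.crc32` (the same CRC-32 polynomial Ethernet's FCS uses) and represents the "stomp" as the bitwise inverse of the correct FCS, a convention commonly described for cut-through platforms; the exact stomp encoding is platform-specific.

```python
import zlib

def compute_fcs(frame: bytes) -> int:
    """Ethernet's FCS is CRC-32; zlib.crc32 uses the same polynomial."""
    return zlib.crc32(frame) & 0xFFFFFFFF

def stomp_fcs(frame: bytes) -> int:
    """Illustrative 'stomp': emit a deliberately wrong FCS (here, the
    bitwise inverse of the correct one) so the next hop must discard.
    Real ASICs use a platform-specific stomp encoding."""
    return compute_fcs(frame) ^ 0xFFFFFFFF

def receiver_check(frame: bytes, fcs: int) -> str:
    """What the receiving side does: accept good frames, and distinguish
    a recognizable stomp from genuine wire corruption."""
    good = compute_fcs(frame)
    if fcs == good:
        return "accept"
    if fcs == (good ^ 0xFFFFFFFF):
        return "discard (stomped CRC)"  # counted as stomped, not a wire error
    return "discard (CRC error)"        # genuine corruption

frame = b"\x00" * 60  # placeholder payload
print(receiver_check(frame, compute_fcs(frame)))  # accept
print(receiver_check(frame, stomp_fcs(frame)))    # discard (stomped CRC)
```

The key point the sketch captures: the receiver can tell a stomped frame apart from random corruption, which is why the two show up in different counters.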


Why Would an ASIC Do This?

In Cisco Nexus / ACI platforms, the ASIC may use CRC stomping instead of a silent drop in certain conditions:

1. Egress Congestion or Microbursts

  • Multiple ingress flows converge on a single 40G uplink

  • Short bursts exceed egress queue capacity

  • Some packets must be dropped

Instead of silently discarding:

  • ASIC marks the frame invalid

  • Sends it out with a stomped CRC

  • Peer detects and discards it

2. Specific Egress Pipeline Conditions

  • Certain internal error-handling paths

  • Some QoS or buffer-management scenarios

  • Platform-dependent behaviors

3. Accounting and Visibility

By stomping CRC:

  • Transmitting side increments output error

  • Receiving side increments input error / stomped CRC

  • Both sides “agree” a packet was lost

This provides end-to-end accounting, even though the drop decision was made upstream.

Why You Don’t Always See “Output Discards”?

A common misconception:


“If the switch drops traffic, it must increment discard counters.”

In reality:

  • Discard counters reflect internal silent drops

  • Output errors can reflect frames invalidated on egress

  • Stomped CRC reflects intentional downstream discard

These are different drop accounting mechanisms.

From an end-to-end perspective, the packet is still dropped—just recorded differently.


Correlation Pattern to Look For


Leaf TX output errors ≈ Spine RX stomped CRC

When you see:

  • Matching or near-matching values on both sides

  • No CRC/runts/giants

  • Stable links

This strongly indicates:

  • Intentional ASIC discard behavior

  • Not a physical-layer fault


Error Rate Matters More Than Absolute Numbers

Always normalize errors against traffic volume.

Typical real-world example:

  • Tens of trillions of packets transmitted

  • Tens of thousands of output errors

  • Spread over multiple years

This translates to:

  • Error ratios on the order of 10⁻⁹

  • Well within acceptable operational tolerance

  • Invisible to applications due to TCP retransmissions


Why Does This Happen “Sometimes, Not Every Day”?

This intermittent nature is a crucial clue:

  • Microbursts are traffic-pattern dependent

  • Rare congestion events occur during peaks

  • Control-plane events, maintenance windows, or transient bursts can contribute

  • No continuous increase = no persistent fault

If this were a physical issue:

  • CRC errors would be continuous

  • Error rate would scale with traffic

  • Reliability would degrade

  • Links would flap or reset

None of that is observed.


How To Validate?


Recommended steps:

1. Monitor Error Delta

  • Capture counters

  • Recheck after several hours/days

  • Confirm errors are not rapidly increasing

2. Check Queue / Congestion Indicators

  • Look for egress queue drops

  • Validate oversubscription ratios

3. Verify Optics Health

  • DOM values (Tx/Rx power, temperature)

  • Ensure within vendor specs

4. Software Context

  • Check platform and release notes

  • Identify known cosmetic or accounting behaviors
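Step 1, monitoring the error delta, can be automated along these lines. This is a minimal sketch: `read_output_errors` is a hypothetical callable standing in for however you actually poll the counter (for example SNMP's IF-MIB `ifOutErrors`, or the APIC REST API):

```python
import time

def error_delta(read_output_errors, interval_s: float, samples: int = 2):
    """Sample an interface's output-error counter over time and return
    the per-interval deltas; a flat delta means no active fault."""
    readings = []
    for _ in range(samples):
        readings.append(read_output_errors())
        if len(readings) < samples:
            time.sleep(interval_s)
    return [b - a for a, b in zip(readings, readings[1:])]

# Hypothetical reader: in practice this would poll the switch.
counter_values = iter([41210, 41210, 41211])
deltas = error_delta(lambda: next(counter_values),
                     interval_s=0.0, samples=3)
print(deltas)  # [0, 1] -> essentially flat, no persistent fault
```

In production you would sample over hours or days, not seconds; what matters is whether the deltas stay near zero or track traffic peaks.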


Conclusion

When all of the following are true:

  • No CRC/runts/giants

  • Stable links over long periods

  • Extremely low error rate

  • Symmetric output error ↔ stomped CRC pattern

  • Clean host-facing interfaces


Then:


The observed output errors result from rare, intentional packet discards performed by the ASIC, most commonly during brief congestion events. They do not indicate a hardware defect, cabling problem, or misconfiguration. This behavior is normal, well documented, and safe in high-throughput Cisco ACI fabrics.

