BFD (Bidirectional Forwarding Detection) in ACI

  • Writer: Mukesh Chanderia
  • Jan 1

What is BFD?

  • Bidirectional Forwarding Detection (BFD) is a network protocol that swiftly identifies failures in the forwarding path between two devices, such as routers or switches.

  • It enables rapid detection of faults, often within milliseconds (sub-second), enhancing network reliability by reducing downtime.


When to Use BFD:

  1. Indirect Connections:

    • In scenarios where routers are connected through a Layer 2 device or cloud and cannot directly detect each other’s failures, BFD offers quick failure detection, bypassing the longer timeouts of traditional protocols.​

  2. Unreliable Media:

    • For connections over media lacking reliable failure detection mechanisms, like shared Ethernet, BFD provides swift detection, ensuring timely responses to issues.​

  3. Multiple Protocols:

    • When multiple protocols operate between two routers, each with its own failure detection timers, BFD standardizes detection times, leading to consistent and predictable network behavior.​


BFD in Cisco ACI:


  • Monitoring Spine-to-Leaf Connections:

    • BFD rapidly identifies failures in critical ACI fabric links between spine and leaf switches, maintaining network stability.​

  • Enhancing Routing Protocols:

    • By integrating with protocols like OSPF, BGP, and static routes, BFD accelerates network convergence during failures, minimizing downtime.​

  • Ensuring Application Availability:

    • BFD helps prevent application downtime by enabling swift rerouting of traffic in response to network issues.​


Configuration Steps for Fabric BFD


  1. Enable Global BFD:

    • Navigate to: Fabric > Fabric Policies > Policies > Interface > L3 Interface > default.​

    • Enable the BFD ISIS Policy configuration.​

    • Verify neighbors using the command:​


      Spine/Leaf# show bfd neighbors vrf overlay-1


  2. Modify Global BFD Timer:

    • Navigate to: Fabric > Access Policies > Policies > Switch > BFD > BFD IPv4.​

    • Create or edit a BFD policy with desired timers.​

    • Check neighbor details using:​


      Spine/Leaf# show bfd ipv4 neighbors details


  3. Enable Interface-level BFD:


    • Important: Disable the global BFD setting first.​

    • Create a new L3 Interface Policy (e.g., "NP") and enable the BFD ISIS Policy configuration within it.​


    • Assign this new L3 Interface Policy to the appropriate Policy Groups:​

      • Leaf Interface Policy Group: Create a new policy group (e.g., "LNPG") and attach the "NP" L3 interface policy.

      • Spine Interface Policy Group: Create a new policy group (e.g., "SNPG") and attach the "NP" L3 interface policy.


    • Associate these Policy Groups with the correct Interface Profiles:​

      • Leaf Interface Profile: Create a new profile and attach the "LNPG" policy group.

      • Spine Interface Profile: Create a new profile and attach the "SNPG" policy group.

    • Ensure that the relevant switch profiles utilize "LNPG" and "SNPG" to activate Fabric BFD on those interfaces.​
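The GUI steps above can also be expressed as a REST payload pushed to the APIC. The sketch below only builds the JSON body; the class name `l3IfPol` and the `bfdIsis` attribute are assumptions based on typical ACI object naming, so verify them on your fabric (e.g. with `moquery -c l3IfPol`) before use.

```python
import json

def build_l3_if_policy(name: str, bfd_isis: str = "enabled") -> str:
    """Build a hypothetical APIC REST body for a fabric L3 Interface Policy
    with BFD for IS-IS enabled (class name l3IfPol is an assumption)."""
    payload = {
        "l3IfPol": {
            "attributes": {
                "name": name,          # e.g. the "NP" policy from the steps above
                "bfdIsis": bfd_isis,   # "enabled" or "disabled"
            }
        }
    }
    return json.dumps(payload, indent=2)

if __name__ == "__main__":
    # The target URL (e.g. https://<apic>/api/mo/uni/fabric.json) is also an
    # assumption; check your APIC's API inspector for the exact path.
    print(build_l3_if_policy("NP"))
```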


Guidelines and Limitations:


  • Supported Features:

    • Starting from APIC Release 3.1(1), BFD supports IS-IS on fabric interfaces between leaf and spine switches. Additionally, BFD is supported on spine switches for OSPF and static routes.​

    • BFD is compatible with modular spine switches equipped with -EX and -FX line cards (or newer versions), as well as the Nexus 9364C non-modular spine switch (or newer versions).​

    • From APIC Release 5.0(1), BFD multihop is supported on leaf switches, and ACI supports C-bit-aware BFD, determining whether BFD is dependent or independent of the control plane.​


  • Limitations:


    • BFD between vPC peers is not supported.​

    • BFD over iBGP is not supported for loopback address peers.​

    • BFD on Layer 3 Outs (L3Out) is supported only on routed interfaces, subinterfaces, and SVIs; it is not supported on loopback interfaces.​

    • BFD for BGP prefix peers (dynamic neighbors) is not supported.​

    • Enabling BFD subinterface optimization on one subinterface activates it for all subinterfaces on the same physical interface.


Enabling BFD on L3Out:


In Cisco ACI deployments, Bidirectional Forwarding Detection (BFD) is commonly used alongside BGP on L3Out connections to achieve fast routing convergence.


However, when BFD timers are not tuned appropriately for external peers, you may see unexpected BFD session flaps, even when the underlying physical links are perfectly healthy.


  • To enable BFD on an L3Out:​

    • Check the BFD option within the Logical Interface Profile under the respective routing protocol (BGP, OSPF, or EIGRP).​

    • By default, BFD parameters are derived from the global default BFD policy located at: Fabric > Access Policies > Policies > Switch > BFD > BFD IPv4/v6 > default.​

    • For custom BFD settings, create a non-default BFD policy and apply it to specific switches via the Switch Policy Group and Switch Profile under Fabric > Access Policies > Switches.​

    • To override switch-level global BFD parameters at the interface level, create a BFD Interface Profile under the Logical Interface Profile. This interface-level BFD policy is located under: Tenant > Policies > Protocol > BFD.
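As a rough illustration of the interface-level override described above, the sketch below builds the REST body for a tenant-scoped BFD Interface Policy. The class and attribute names (`bfdIfPol`, `minTxIntvl`, `minRxIntvl`, `detectMult`) are assumptions drawn from ACI's usual naming conventions; confirm them with `moquery -c bfdIfPol` before relying on them.

```python
import json

def build_bfd_if_policy(name, min_tx_ms=300, min_rx_ms=300, mult=3):
    """Hypothetical APIC REST body for a tenant BFD Interface Policy.

    Class/attribute names are assumed; APIC typically stores numeric
    attributes as strings, and the intervals here are in milliseconds.
    """
    return {
        "bfdIfPol": {
            "attributes": {
                "name": name,
                "minTxIntvl": str(min_tx_ms),
                "minRxIntvl": str(min_rx_ms),
                "detectMult": str(mult),
            }
        }
    }

if __name__ == "__main__":
    print(json.dumps(build_bfd_if_policy("BFD-L3OUT"), indent=2))
```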


How to Apply in APIC


Tenant

→ Policies

→ Protocol

→ BFD

→ BFD Interface Policy (Add)


Then attach the policy to your L3Out interface:


Tenant

→ L3Out

→ Node Profile

→ Interface Profile

→ Interface

→ Routing Protocol Policy → BFD Policy


Recommended ACI BFD Policy for L3Out


Parameter     Value      Why it works
Tx Interval   300 ms     Prevents false flaps during microbursts
Rx Interval   300 ms     Matches peer expectation
Multiplier    10         Allows ~3 seconds of tolerance
Echo Mode     Optional   Helps with sub-second detection if both sides support it

This gives enough stability without sacrificing convergence speed.
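The multiplier's effect is simple arithmetic: BFD declares a neighbor down after `multiplier` consecutive missed control packets, so detection time is roughly the negotiated receive interval times the multiplier. A quick sketch comparing the recommended policy with the aggressive timers discussed later:

```python
def bfd_detection_time_ms(rx_interval_ms: int, multiplier: int) -> int:
    """Approximate BFD detection time: the session is declared down after
    `multiplier` consecutive control packets are missed."""
    return rx_interval_ms * multiplier

# Recommended L3Out policy above: 300 ms Rx, multiplier 10
print(bfd_detection_time_ms(300, 10))  # 3000 ms, ~3 s of tolerance

# Aggressive timers: 50 ms Rx, multiplier 3
print(bfd_detection_time_ms(50, 3))    # 150 ms, little headroom for jitter
```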


Leaf# show bfd neighbors vrf all details


MinTxInt: 300000 us

MinRxInt: 300000 us

Multiplier: 10

State: Up
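When checking many sessions at once, the key/value lines in the output above are easy to parse. A minimal sketch (the exact field names in `show bfd neighbors ... details` output may vary by NX-OS release, so treat the keys below as examples):

```python
def parse_bfd_detail(text: str) -> dict:
    """Parse 'Key: value' lines from `show bfd neighbors ... details` output."""
    session = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            session[key.strip()] = value.strip()
    return session

# Sample output matching the excerpt above (intervals are in microseconds)
sample = """\
MinTxInt: 300000 us
MinRxInt: 300000 us
Multiplier: 10
State: Up
"""

info = parse_bfd_detail(sample)
print(info["State"])       # Up
print(info["Multiplier"])  # 10
```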


Understanding BFD-Driven Link Flaps in Cisco ACI


Link flaps involving port-channels in Cisco ACI can sometimes appear to be physical issues, but in many real-world cases, the true trigger lies within control-plane protocols—most commonly BFD (Bidirectional Forwarding Detection).


In high-speed fabrics (40/100G), BFD is used extensively to provide rapid failure detection between ACI leaf switches and external routers. BFD timers can be configured aggressively to achieve sub-second convergence. However, excessively aggressive timers, especially the commonly used:

  • 50 ms Hello

  • 150 ms Detection Time (Multiplier 3)

may lead to false-positive BFD down events, causing unnecessary port-channel resets even when the physical link is perfectly healthy.


Symptom Description

Typical indicators that BFD is the trigger (rather than optics or cabling) include:

  • Port-channels transitioning down with error: MINLINK_REQ_NOT_MET_DUE_TO_BFD_CHANNEL_REINIT

  • Interfaces appear physically UP but flap briefly

  • No CRC, no errors, no LOS logged on ACI side

  • BFD sessions reset at the exact timestamp of the flap

  • Recovery happens within ~30 seconds


Confirming Whether BFD or Physical Layer Dropped First


To determine this, we correlate:

  • ACI FSM transitions

  • ACI event logs

  • Peer router logs (IOS-XR, NX-OS, ASA, etc.)

  • BFD session histories


Evidence: ACI Port FSM Timestamps

ACI’s internal FSM logs are the most authoritative source.

Example FSM sequence from affected interface:


ETH_PORT_FSM_EV_PROTO_DOWN ← BFD protocol down FIRST

ETH_PORT_FSM_EV_PHY_DOWN ← Physical interface down SECOND


This ordering clearly shows that:

BFD session timed out before the physical interface dropped.

This is the hallmark of aggressive BFD timers, not optical degradation.


ACI Syslog Confirmation

Matching syslog entries typically show:


%ETH_PORT_CHANNEL-3-MINLINK_REQ_NOT_MET_DUE_TO_BFD_CHANNEL_REINIT


This log message directly states that BFD state caused the port-channel reinitialization—not the physical link.


Log excerpt from ACI event-history:


"Control Detection Time Expired"


This is the direct indicator that BFD expired before receiving enough consecutive Hellos—even though the physical link may have been stable.


Understanding the Root Cause – Aggressive BFD Timers


50 ms BFD timers are very aggressive for high-throughput environments.


Even minor jitter, CPU scheduling delay, or packet micro-loss can cause


"Control Detection Time Expired" events.


Why Does This Happen?


Common contributing factors include:

• Microburst-induced jitter

High-speed fabrics may momentarily delay control packets even when data traffic is unaffected.


• MTS buffer delays inside ACI

Under transient CPU load, ACI may drop or delay BFD packets.


• Optical power momentarily near threshold

Brief power fluctuations (not long enough to generate LOS alarms) can break multiple consecutive BFD packets.


• Peer-side packet scheduling delays

Routers under load may momentarily miss a 50 ms BFD transmission interval.

When timers are too aggressive, these brief delays cause false BFD session drops.


Recommended Permanent Fix – Relax BFD Timers

A stable and widely recommended configuration is:

  • 300 ms Min TX

  • 300 ms Min RX

  • Multiplier 3

This results in:


900 ms total detection time (300 ms × 3)

Fast enough for convergence, stable enough to prevent false flaps.



Verification Commands


vsh -c "show bfd neighbors detail"

show system internal bfd event-history session

show system internal ethpm event-history interface <interface>

moquery -c bfdIfP


