pcTag (zoning-rule) & Policy TCAM

  • Writer: Mukesh Chanderia
  • Nov 21, 2023
  • 16 min read


Understanding pcTag in Cisco ACI


1. What is pcTag?

  • pcTag (Policy Control Tag):

    • A unique identifier assigned to each Endpoint Group (EPG) in Cisco ACI.


2. Assignment and Purpose

  • Assignment:

    • Assigned to an EPG when it is created.

  • Purpose:

    • Used in contract rules on leaf switches to control and secure network traffic.

    • These security rules are known as zoning rules.


3. Components of a Zoning Rule

Each zoning rule includes the following parameters:

  1. Source EPG (src pcTag):

    • The pcTag of the EPG where the traffic originates.

  2. Destination EPG (dst pcTag):

    • The pcTag of the EPG where the traffic is intended to go.

  3. Filter ID:

    • Defines the type of traffic, such as TCP traffic on destination port 3306.

  4. Scope:

    • Identifies the Virtual Routing and Forwarding (VRF) instance, by its VXLAN Network Identifier (VNID), in which the source and destination EPGs reside.

  5. Action:

    • Determines whether to permit or deny the traffic.


4. Core Parameters of a Zoning Rule

  • Source pcTag, Destination pcTag, and Filter ID:

    • These are the main elements that define what kind of traffic is allowed between which EPGs.


5. Scope Parameter

  • Definition:

    • Specifies the VRF in which the zoning rule applies.

  • Importance:

    • Qualifies every pcTag with its VRF, so a local pcTag that is reused in another VRF cannot match the wrong rule.
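The role of the scope can be illustrated with a small Python model. This is only a sketch of the idea, not how the leaf hardware performs the lookup: the `ZoningRule` class, the `find_action` helper, and the scope and filter values are hypothetical, and real zoning rules also involve priorities and wildcard entries that are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZoningRule:
    scope: int       # VRF VNID the rule belongs to
    src_pctag: int   # pcTag of the source EPG
    dst_pctag: int   # pcTag of the destination EPG
    filter_id: int   # ID of the filter describing the traffic
    action: str      # "permit" or "deny"


def find_action(rules, scope, src_pctag, dst_pctag, filter_id, default="deny"):
    """Return the action of the first rule matching all four keys.

    Falls back to `default`, modelling an enforced VRF where traffic
    without a matching contract rule is not allowed.
    """
    wanted = (scope, src_pctag, dst_pctag, filter_id)
    for rule in rules:
        if (rule.scope, rule.src_pctag, rule.dst_pctag, rule.filter_id) == wanted:
            return rule.action
    return default


# Two VRFs (hypothetical scopes) reuse the same local pcTags without conflict.
RULES = [
    ZoningRule(2424832, 16386, 16387, 5, "permit"),
    ZoningRule(2818048, 16386, 16387, 5, "deny"),
]

print(find_action(RULES, 2424832, 16386, 16387, 5))
print(find_action(RULES, 2818048, 16386, 16387, 5))
```

Because the scope is part of the lookup key, the same pair of local pcTags produces a different result in each VRF.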


6. Types of pcTags and Their Ranges

  1. System Reserved pcTag (1-15):

    • Used for internal system rules.

  2. Global pcTag (16-16385):

    • Unique across all VRFs.

    • Used for shared services like VRF route leaking.

  3. Local pcTag (16386-65535):

    • Unique only within a single VRF.

    • Default for internal EPGs and L3Out EPGs.
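As a quick aid when reading CLI output, the ranges above can be turned into a small helper. This is a minimal sketch; the function name `pctag_type` is hypothetical and the boundaries are simply the ones listed in this section.

```python
def pctag_type(pctag: int) -> str:
    """Classify a pcTag using the ranges listed in this section."""
    if 1 <= pctag <= 15:
        return "system reserved"
    if 16 <= pctag <= 16385:
        return "global"
    if 16386 <= pctag <= 65535:
        return "local"
    raise ValueError(f"pcTag {pctag} is outside the ranges covered here")


for tag in (13, 10931, 16386):
    print(tag, pctag_type(tag))
```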


7. Default Behavior

  • EPGs in Different VRFs:

    • May have overlapping local pcTags.

    • This is acceptable because traffic remains within the same VRF.


8. pcTag Assignment with VRF Route Leaking

  • When VRF Route Leaking is Enabled:

    • EPGs that provide shared services are assigned a new global pcTag instead of their original local pcTag.

  • Provider EPGs:

    • Only EPGs configured for shared services receive a global pcTag.


9. Summary of pcTag Types and Usage

  • System Reserved (1-15):

    • Internal use.

  • Global (16-16385):

    • Across VRFs for shared services.

  • Local (16386-65535):

    • Within a single VRF for regular EPGs.


Key Takeaways


  • pcTag: A unique ID for each EPG, essential for defining and enforcing security rules.

  • Zoning Rules: Utilize pcTags to control traffic between EPGs based on defined parameters.

  • Scope: Ensures pcTags are unique within each VRF, crucial for maintaining proper traffic segregation.

  • pcTag Types:

    • System Reserved: For internal operations.

    • Global: For shared services across VRFs.

    • Local: For standard EPGs within a single VRF.

  • VRF Route Leaking: Assigns global pcTags to provider EPGs to facilitate shared services and cross-VRF communication.


When a filter is applied to a contract subject, two options control its direction:

  1. Apply Both Directions

  2. Reverse Filter Ports


Let’s break each down first:


1. Apply Both Directions

  • Disabled → The filter is applied in one direction only, from consumer to provider.

  • Enabled → The same filter entry is also applied in the reverse (provider-to-consumer) direction.

    • In other words, you don’t need to write two separate filters (one for each way).

Example:

  • Filter allows TCP/80.

  • If "Apply Both Directions = Enabled", then traffic to destination port 80 is allowed in both directions (the source and destination EPGs swap, the ports do not).

  • If Disabled, then only consumer → provider TCP/80 is allowed.


2. Reverse Filter Ports

  • This setting controls whether source/destination ports are also reversed when applying filters.

  • Useful for client-server scenarios:

    • Client sends src port = random high port, dst port = 80.

    • Server replies src port = 80, dst port = random high port.

Example:

  • If you have a filter only allowing TCP dst=80, the return traffic from the server (src=80, dst=high-port) would be dropped — unless “Reverse Filter Ports” is enabled.

  • When enabled, ACI installs a mirror filter entry (swaps source/dest ports), so the return traffic matches too.


Putting It Together: The 3 Scenarios


Scenario 1: Apply Both Directions – Disabled, Reverse Filter Ports – Disabled

  • Strictest case.

  • Contract allows only consumer → provider traffic matching exactly the filter definition.

  • No automatic reverse flows or port swapping.

  • You’d need two filters (client→server and server→client).

Used in very controlled security environments.

❌ Easy to misconfigure (you’ll block return traffic if you forget the reverse entry).


Scenario 2: Apply Both Directions – Enabled, Reverse Filter Ports – Disabled

  • Contract is applied both directions, but ports are not swapped.

  • So the exact filter applies forward and backward.

  • Example: Allow TCP dst=80 → ACI also allows TCP dst=80 in the reverse direction.

  • But this does NOT cover the server reply (which is src=80, dst=high port).

Works if both sides use the same well-known port.

❌ Doesn’t help in normal client-server (ephemeral ports).


Scenario 3: Apply Both Directions – Enabled, Reverse Filter Ports – Enabled

  • Easiest / most common for client-server apps.

  • Filter applies forward and backward, and source/dest ports are swapped.

  • Example: Filter TCP dst=80 → allows client→server (dst=80) and server→client (src=80).

  • Covers the typical return traffic pattern.


Best for most L4–L7 traffic (web, DB, etc.).

❌ May be “too open” if not carefully controlled (because it adds reverse entries).

📌 Best Practice from Cisco Whitepapers

  • Scenario 3 (Apply Both Directions + Reverse Filter Ports enabled) is the default recommended setting for most contracts.

  • Only use Scenario 1 if you need fine-grained unidirectional control.

  • Scenario 2 is rare — useful only for protocols where the same port is used in both directions.
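The three scenarios can be summarized as a small function that expands one filter into the entries each combination would produce. This is a simplified sketch, not APIC behavior verbatim: the names `Entry` and `expand_filter` are hypothetical, "forward" and "reverse" stand for the two directions between the EPGs, and the sketch assumes Reverse Filter Ports has no effect when Apply Both Directions is off, since only three combinations are covered above.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    direction: str  # "forward" or "reverse" between the two EPGs
    src_port: str
    dst_port: str


def expand_filter(dst_port: str, apply_both_directions: bool,
                  reverse_filter_ports: bool) -> List[Entry]:
    """Expand a 'dst port = X' filter according to the two subject options."""
    entries = [Entry("forward", "any", dst_port)]
    if apply_both_directions:
        if reverse_filter_ports:
            # Scenario 3: ports are swapped for the return traffic.
            entries.append(Entry("reverse", dst_port, "any"))
        else:
            # Scenario 2: the identical filter is applied the other way round.
            entries.append(Entry("reverse", "any", dst_port))
    return entries


for both, rev in ((False, False), (True, False), (True, True)):
    print(both, rev, expand_filter("80", both, rev))
```

Only the last combination yields a reverse entry with source port 80, which is the one that matches a typical server reply.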


leaf-a# show system internal epm endpoint ip 10.0.1.1


MAC : 0050.5600.0001 ::: Num IPs : 1

IP# 0 : 10.0.1.1 ::: IP# 0 flags : ::: l3-sw-hit: No

Vlan id : 10 ::: Vlan vnid : 8393 ::: VRF name : Sales:Presales_VRF

BD vnid : 16056264 ::: VRF vnid : 2424832

Phy If : 0x1a002000 ::: Tunnel If : 0

Interface : Ethernet1/3

Flags : 0x80005c04 ::: sclass : 10931 ::: Ref count : 5


leaf-a# show system internal epm endpoint ip 10.0.1.1 | egrep " VRF vnid|sclass "

BD vnid : 16056264 ::: VRF vnid : 2424832

Flags : 0x80005c04 ::: sclass : 10931 ::: Ref count : 5
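When this lookup is repeated often, the two values of interest (VRF vnid and sclass) can be pulled out of saved output with a few lines of Python. The sketch below assumes the output has been captured as text in exactly the `key : value ::: key : value` layout shown above; the helper names and the source pcTag 16386 passed in the example are illustrative choices, not part of any Cisco tooling.

```python
EPM_OUTPUT = """\
Vlan id : 10 ::: Vlan vnid : 8393 ::: VRF name : Sales:Presales_VRF
BD vnid : 16056264 ::: VRF vnid : 2424832
Flags : 0x80005c04 ::: sclass : 10931 ::: Ref count : 5
"""


def parse_epm_fields(text: str) -> dict:
    """Split 'key : value ::: key : value' lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        for chunk in line.split(":::"):
            key, sep, value = chunk.partition(" : ")
            if sep:
                fields[key.strip()] = value.strip()
    return fields


def zoning_rule_command(fields: dict, src_pctag: int) -> str:
    """Build the zoning-rule lookup for traffic towards this endpoint."""
    return (f"show zoning-rule scope {fields['VRF vnid']} "
            f"src-epg {src_pctag} dst-epg {fields['sclass']}")


fields = parse_epm_fields(EPM_OUTPUT)
print(fields["VRF vnid"], fields["sclass"])
print(zoning_rule_command(fields, 16386))
```

The VRF vnid is the scope and the sclass is the pcTag of the endpoint's EPG, so together they are enough to narrow down the zoning rules.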


leaf-a# show zoning-rule scope 2490369



leaf-a# show zoning-rule scope 2424832 src-epg 16386 dst-epg 10931


Class ID and scope can be easily retrieved from the APIC GUI: open Tenants, select the tenant name on the left, then go to Operational > Resource IDs > EPGs.





leaf# show zoning-rule scope 123456







show zoning-filter



show zoning-filter 5



Policy TCAM Exhaustion in Cisco ACI


1. What is Policy TCAM Exhaustion?

  • TCAM (Ternary Content-Addressable Memory):

    • A type of memory in switch hardware where policies are stored for enforcement.

  • Issue:

    • When an Endpoint Group (EPG) uses a contract, the zoning rules on a leaf switch can use up many TCAM entries.

    • Result: This can lead to TCAM exhaustion, where there are no more entries available for new policies.


2. Optimizing Policy CAM Usage


Option 1: Set Policy Control Enforcement to Unenforced in VRF
  • Default Behavior:

    • Policy Control Enforcement is enabled by default.

    • Effect: EPGs cannot communicate unless there is a specific contract rule.

  • Unenforced Mode:

    • Action: Turn off Policy Control Enforcement.

    • Result:

      • No contract rules are applied.

      • Any endpoints can communicate freely as long as they are connected via Layer 2 or Layer 3.


Option 2: Use Contracts with vzAny

  • What is vzAny?

    • A managed object that links all EPGs within a VRF to one or more contracts.

    • Benefit: Avoids creating separate contract rules for each EPG.

  • How It Works:

    • Automatically applies contract rules to all EPGs in a VRF.

    • When a new EPG is added, vzAny automatically includes it in the contract rules.

  • Advantages:

    • Simplifies Configuration: Reduces the number of individual contract rules.

    • Saves TCAM Space: Combines multiple rules into one, lowering TCAM usage.

  • Example:

    • Without vzAny:

      • Rule 1: EPG 16401 → EPG 16402 (FTP)

      • Rule 2: EPG 16401 → EPG 16403 (FTP)

      • Rule 3: EPG 16401 → EPG 16404 (FTP)

    • With vzAny:

      • Rule: EPG 16401 → vzAny (All EPGs) (FTP)
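The saving in this example can be made concrete with a short sketch. It only counts one direction of one filter, as the example above does; the function names are hypothetical and the tuples are a simplified stand-in for real zoning rules.

```python
def rules_without_vzany(src_pctag, dst_pctags, filter_name):
    """One rule per destination EPG."""
    return [(src_pctag, dst, filter_name) for dst in dst_pctags]


def rules_with_vzany(src_pctag, filter_name):
    """A single rule whose destination stands for every EPG in the VRF."""
    return [(src_pctag, "vzAny", filter_name)]


per_epg = rules_without_vzany(16401, [16402, 16403, 16404], "FTP")
shared = rules_with_vzany(16401, "FTP")
print(len(per_epg), len(shared))
```

The per-EPG count grows with every EPG added to the VRF, while the vzAny count stays at one.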



Guidelines and Limitations:


  • Represents Everyone in the Same VRF:

    • Includes internal EPGs, external EPGs for L2Outs and L3Outs, and management networks.

  • Usage Restrictions:

    • Supported as Consumer: Can consume shared services.

    • Not Supported as Provider: Cannot provide shared services.

  • Communication Impact:

    • Using vzAny as a consumer allows any EPG in the consumer VRF to communicate with the provider VRF.

  • Scope Considerations:

    • If the contract scope is set to Application Profile, vzAny won’t save TCAM space as it will still create individual zoning rules.




Option 3: Use Contract Preferred Group
  • Purpose:

    • Simplifies configurations where multiple EPGs share the same contract.

  • Example Scenario:

    • Requirement: Allow EPGs 1 to 4 to communicate with each other without security restrictions.

    • Action: Add EPGs 1-4 to the contract preferred group so that they can talk to each other freely.

    • Effect: Other EPGs will still follow the allow list model, maintaining security for the rest of the network.


To simplify cases like this, where contract policies should be only partially unenforced within a VRF, ACI introduced the Contract Preferred Group in APIC release 2.2(1).



Preferred Group in Cisco ACI


  • Included and Excluded Members:

    • Included Members:

      • Specific Endpoint Groups (EPGs) are marked as "Included".

      • Example: EPG 1 to EPG 4 are designated as Included members.

    • Excluded Members:

      • All other EPGs that are not Included are grouped as "Excluded" members.

  • Communication Rules:

    • No Contracts Needed for Included Members:

      • EPGs within the Included group do not require any contract rules to communicate.

    • Free Communication:

      • Included EPGs can freely talk to each other without any security enforcement or restrictions.
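The membership logic can be expressed as a one-line check. This is a deliberately simplified sketch: `needs_contract` and the EPG names are hypothetical, and it models only the Included/Excluded distinction described here.

```python
def needs_contract(src_epg: str, dst_epg: str, included: set) -> bool:
    """Return True unless both EPGs are Included members of the preferred group."""
    return not (src_epg in included and dst_epg in included)


INCLUDED = {"EPG1", "EPG2", "EPG3", "EPG4"}

print(needs_contract("EPG1", "EPG3", INCLUDED))
print(needs_contract("EPG1", "EPG5", INCLUDED))
```

As soon as one side is an Excluded member, the normal allow-list model applies and a contract is required.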


To configure Contract Preferred Group, follow these steps:

  1. Enable the Preferred Group under the VRF.


  2. Add EPGs as “Included” members. By default, all EPGs are “Excluded” members.



Note: When adding an L3Out EPG as an “Included” member, 0.0.0.0/0 with the “External Subnets for the External EPG” scope is not supported. Use 0.0.0.0/1 and 128.0.0.0/1 instead.


Tools - To identify policy drop / packet drop


A) show system internal policy-mgr stats


The command "show system internal policy-mgr stats" is used to verify the number of hits per zoning rule.



leaf# show system internal policy-mgr stats

Requested Rule Statistics

Rule (4131) DN (sys/actrl/scope-2818048/rule-2818048-s-16410-d-25-f-424) Ingress: 0, Egress: 0, Pkts: 0 RevPkts: 0

Rule (4156) DN (sys/actrl/scope-2818048/rule-2818048-s-25-d-16410-f-425) Ingress: 0, Egress: 0, Pkts: 0 RevPkts: 0



Breakdown of Key Elements:


1. Rule Identification

Each rule is assigned a unique rule number (e.g., 4131, 4156) and a Distinguished Name (DN), which defines its scope and parameters.


The DN follows the structure:


sys/actrl/scope-<scope_id>/rule-<scope_id>-s-<source_pcTag>-d-<destination_pcTag>-f-<filter_id>


sys/actrl → This refers to the Access Control subsystem within the system (sys).


scope-2818048 → The scope (VRF VNID) in which the rule applies; the same value is repeated at the start of the rule name.

s-16410 → The source pcTag.

d-25 → The destination pcTag.

f-424 / f-425 → The filter IDs.



2. Counters Explained

Each rule contains four key counters that help track packet flow:


Ingress: Number of packets matching the rule entering the system.

Egress: Number of packets matching the rule leaving the system.

Pkts: Total number of packets processed by the rule.

RevPkts: Packets flowing in the reverse direction, if applicable.
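If the statistics are collected as text, each line can be split into its DN parts and counters with a regular expression. The sketch below is written against the exact line layout shown above and is an assumption beyond that: the pattern and the `parse_stats_line` name are hypothetical, and lines whose DN contains non-numeric values are simply skipped.

```python
import re

STATS_LINE = (
    "Rule (4131) DN (sys/actrl/scope-2818048/rule-2818048-s-16410-d-25-f-424) "
    "Ingress: 0, Egress: 0, Pkts: 0 RevPkts: 0"
)

STATS_RE = re.compile(
    r"Rule \((?P<rule>\d+)\) DN \(sys/actrl/scope-(?P<scope>\d+)/"
    r"rule-\d+-s-(?P<src>\d+)-d-(?P<dst>\d+)-f-(?P<f_id>\d+)\)\s+"
    r"Ingress: (?P<ingress>\d+), Egress: (?P<egress>\d+), "
    r"Pkts: (?P<pkts>\d+)\s+RevPkts: (?P<revpkts>\d+)"
)


def parse_stats_line(line: str):
    """Return the numeric fields of one statistics line, or None if it does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


parsed = parse_stats_line(STATS_LINE)
print(parsed)
```

Sampling the same rule twice and comparing `pkts` is a simple way to tell whether traffic is currently hitting it.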


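Understanding f-424 and f-425


The "f-xxx" values (e.g., f-424, f-425) are the Filter IDs of the zoning rules, i.e., the Filter ID parameter listed among the zoning rule components earlier.


These IDs are not manually configured; the system assigns them automatically when the filters of a contract are programmed on the leaf. The contents of a filter can be displayed with show zoning-filter.


The two rules have their source and destination pcTags swapped, so f-424 and f-425 are the filters for the two directions of the same contract.

In the contract_parser.py output shown later, rule 4131 matches destination port 123 and rule 4156 matches source port 123.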



B) show logging ip access-list internal packet-log deny


A switch-level command that can be run from iBash; it reports ACL (contract) related drops together with flow-related information.


leaf# show logging ip access-list internal packet-log deny


[ Tue Oct 1 10:34:37 2019 377572 usecs]: CName: Prod1:VRF1(VXLAN: 2654209), VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c,

DMac:0x000c0c0c0c0c, SIP: 192.168.21.11, DIP: 192.168.22.11, SPort: 0, DPort: 0, Src Intf: Tunnel7, Proto: 1, PktLen: 98

[ Tue Oct 1 10:34:36 2019 377731 usecs]: CName: Prod1:VRF1(VXLAN: 2654209), VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c,

DMac:0x000c0c0c0c0c, SIP: 192.168.21.11, DIP: 192.168.22.11, SPort: 0, DPort: 0, Src Intf: Tunnel7, Proto: 1, PktLen: 98



Breaking Down the Output:

Each log entry provides detailed packet information that was denied. Let's analyze the key fields in detail:


Timestamp :


[ Tue Oct 1 10:34:37 2019 377572 usecs ]


Tue Oct 1 10:34:37 2019 → The date and time when the packet was denied.

377572 usecs → The precise microsecond timestamp of the event.



Context and VRF Information:



CName: Prod1:VRF1(VXLAN: 2654209)


CName: Prod1:VRF1 → The context (VRF) name in Tenant:VRF form; Prod1 is the tenant.

VRF1 → The Virtual Routing and Forwarding (VRF) instance in which the packet was observed.

VXLAN: 2654209 → The VNID of that VRF, which is the scope value used with show zoning-rule scope.



VLAN Information:


VlanType: Unknown, Vlan-Id: 0


VlanType: Unknown → This suggests the packet is encapsulated (such as VXLAN) and not associated with a traditional VLAN.

Vlan-Id: 0 → No explicit VLAN tag is associated.



MAC Addresses (Layer 2 Information):


SMac (Source MAC): 0x000c0c0c0c0c → The MAC address of the sender.


DMac (Destination MAC): 0x000c0c0c0c0c → The MAC address of the receiver.



IP Information (Layer 3):


SIP: 192.168.21.11, DIP: 192.168.22.11


SIP: 192.168.21.11 → The Source IP Address of the denied packet.

DIP: 192.168.22.11 → The Destination IP Address of the denied packet.



Port Information (Layer 4):


SPort: 0, DPort: 0



SPort: 0 (Source Port) and DPort: 0 (Destination Port) → Ports are set to 0, which usually indicates ICMP traffic (ping, traceroute, etc.).


If it were TCP/UDP, these would reflect actual port numbers.



Interface Information:


Src Intf: Tunnel7 → The denied packet was received on Tunnel7.

This suggests the traffic is part of an encapsulated tunnel, likely a VXLAN tunnel.



Protocol & Packet Size:


Proto: 1, PktLen: 98


Proto: 1 → Protocol 1 refers to ICMP (used for ping requests).

PktLen: 98 → The packet size was 98 bytes.
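The field-by-field breakdown above can be automated for saved logs. The following sketch assumes each entry has first been joined onto a single line and follows exactly the layout shown in the sample; the `parse_deny_entry` name, the regular expression, and the small protocol table are illustrative choices.

```python
import re

PROTO_NAMES = {1: "icmp", 6: "tcp", 17: "udp"}  # IP protocol numbers

LOG_ENTRY = (
    "[ Tue Oct 1 10:34:37 2019 377572 usecs]: CName: Prod1:VRF1(VXLAN: 2654209), "
    "VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c, "
    "DMac:0x000c0c0c0c0c, SIP: 192.168.21.11, DIP: 192.168.22.11, "
    "SPort: 0, DPort: 0, Src Intf: Tunnel7, Proto: 1, PktLen: 98"
)

ENTRY_RE = re.compile(
    r"\[\s*(?P<timestamp>[^\]]+?)\s*\]:\s*"
    r"CName: (?P<tenant>[^:]+):(?P<vrf>[^(]+)\(VXLAN: (?P<vnid>\d+)\),\s*"
    r"(?P<rest>.*)"
)


def parse_deny_entry(entry: str):
    """Split one deny-log entry into a dict of strings, or return None."""
    match = ENTRY_RE.match(entry)
    if match is None:
        return None
    fields = {k: match.group(k) for k in ("timestamp", "tenant", "vrf", "vnid")}
    # The remaining fields are comma-separated 'Key: value' pairs.
    for item in match.group("rest").split(","):
        key, sep, value = item.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    proto = fields.get("Proto", "")
    fields["proto_name"] = (
        PROTO_NAMES.get(int(proto), "other") if proto.isdigit() else "unknown"
    )
    return fields


entry = parse_deny_entry(LOG_ENTRY)
print(entry["tenant"], entry["vrf"], entry["vnid"])
print(entry["SIP"], "->", entry["DIP"], entry["proto_name"])
```

Grouping parsed entries by SIP, DIP, and protocol quickly shows which flows are being dropped most often.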


C) contract_parser.py


A Python script that runs on the leaf and correlates zoning rules, filters, and hit statistics while translating IDs into names. It is particularly valuable because it condenses a multi-step procedure into a single command whose output can be filtered for specific EPGs, VRFs, or other contract-related values.


leaf# contract_parser.py

Key:

[prio:RuleId] [vrf:{str}] action protocol src-epg [src-l4] dst-epg [dst-l4] [flags][contract:{str}] [hit=count]

[7:4131] [vrf:common:default] permit ip tcp tn-Prod1/ap-Services/epg-NTP(16410) tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123

[contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]

[7:4156] [vrf:common:default] permit ip tcp tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123 tn-Prod1/ap-Services/epg-NTP(16410)

[contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]

[12:4169] [vrf:common:default] deny,log any tn-Prod1/l3out-L3Out1/instP-extEpg(25) epg:any [contract:implicit] [hit=0]

[16:4167] [vrf:common:default] permit any epg:any tn-Prod1/bd-Services(32789) [contract:implicit] [hit=0]



The output of contract_parser.py provides a parsed view of contract rules within Cisco ACI. It shows the priority, rule ID, VRF, action, protocol, EPGs, contracts, and hit counts for policy enforcement. Let's break down the key elements.


Understanding the Key Structure


[prio:RuleId] [vrf:{vrf-name}] action protocol src-epg [src-l4] dst-epg [dst-l4] [flags] [contract:{contract-name}] [hit=count]



[prio:RuleId] → Priority level and rule ID.

[vrf:{vrf-name}] → The VRF where the rule is applied.

action → Permit/Deny.

protocol → TCP, UDP, or any.

src-epg [src-l4] → Source EPG (Endpoint Group) and optional Layer 4 (L4) port.

dst-epg [dst-l4] → Destination EPG and optional Layer 4 (L4) port.

[flags] → Additional actions (like logging).

[contract:{contract-name}] → The contract defining the policy.

[hit=count] → Number of times the rule matched traffic.
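For offline analysis of saved output, the bracketed parts of each line can be extracted with a regular expression. This is a sketch against the sample lines in this article only, with each rule joined onto one line; `parse_rule_line` and the pattern are hypothetical helpers and are not part of contract_parser.py itself.

```python
import re

LINE = (
    "[7:4131] [vrf:common:default] permit ip tcp "
    "tn-Prod1/ap-Services/epg-NTP(16410) "
    "tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123 "
    "[contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]"
)

RULE_RE = re.compile(
    r"\[(?P<prio>\d+):(?P<rule_id>\d+)\]\s+"
    r"\[vrf:(?P<vrf>[^\]]+)\]\s+"
    r"(?P<body>.*?)\s+"
    r"\[contract:(?P<contract>[^\]]+)\]\s+"
    r"\[hit=(?P<hit>[^\]]+)\]"
)


def parse_rule_line(line: str):
    """Return the main fields of one parsed-contract line, or None."""
    match = RULE_RE.search(line)
    if match is None:
        return None
    rule = match.groupdict()
    body = rule["body"]
    rule["action"] = body.split()[0]
    # pcTags appear in parentheses after the EPG names.
    rule["pctags"] = re.findall(r"\((\d+)\)", body)
    return rule


rule = parse_rule_line(LINE)
print(rule["prio"], rule["rule_id"], rule["vrf"], rule["action"])
print(rule["pctags"], rule["contract"], rule["hit"])
```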




Rule 1: NTP Traffic Allowed from Internal to External


[7:4131] [vrf:common:default] permit ip tcp tn-Prod1/ap-Services/epg-NTP(16410) tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123 [contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]



Priority: 7

Rule ID: 4131

VRF: common:default

Action: permit

Protocol: IP TCP

Source EPG: tn-Prod1/ap-Services/epg-NTP(16410) (NTP service inside ACI fabric)

Destination EPG: tn-Prod1/l3out-L3Out1/instP-extEpg(25) (External EPG)

Destination Port: eq 123 (the NTP port number; note that this filter matches TCP, whereas NTP itself normally runs over UDP)

Contract: brc-external_to_ntp

Hit Count: 0 (No matching packets yet)




Rule 2: NTP Traffic Allowed from External to Internal


[7:4156] [vrf:common:default] permit ip tcp tn-Prod1/l3out-L3Out1/instP-extEpg(25) eq 123 tn-Prod1/ap-Services/epg-NTP(16410) [contract:uni/tn-Prod1/brc-external_to_ntp] [hit=0]



Priority: 7

Rule ID: 4156

VRF: common:default

Action: permit

Protocol: IP TCP

Source EPG: tn-Prod1/l3out-L3Out1/instP-extEpg(25) (External EPG)

Destination EPG: tn-Prod1/ap-Services/epg-NTP(16410) (NTP service inside ACI fabric)

Source Port: eq 123 (the port appears after the source EPG, so this rule matches return traffic sent from port 123)

Contract: brc-external_to_ntp

Hit Count: 0



Rule 3: Deny All Traffic from External EPG


[12:4169] [vrf:common:default] deny,log any tn-Prod1/l3out-L3Out1/instP-extEpg(25) epg:any [contract:implicit] [hit=0]



Priority: 12

Rule ID: 4169

VRF: common:default

Action: deny,log

Protocol: any

Source EPG: tn-Prod1/l3out-L3Out1/instP-extEpg(25) (External EPG)

Destination EPG: epg:any (any EPG in the VRF)

Contract: implicit

Hit Count: 0




Rule 4: Allow All Traffic to Services BD


[16:4167] [vrf:common:default] permit any epg:any tn-Prod1/bd-Services(32789) [contract:implicit] [hit=0]



Priority: 16

Rule ID: 4167

VRF: common:default

Action: permit

Protocol: any

Source EPG: epg:any (Any endpoint group)

Destination BD: tn-Prod1/bd-Services(32789) (Bridge domain for services)

Contract: implicit

Hit Count: 0



Cisco ACI Policy CAM Exhaustion Due to Scale Profile Mismatch


In Cisco ACI, leaf switch hardware resources are allocated based on the selected forwarding scale profile. These profiles decide how ASIC resources are divided across forwarding tables such as endpoint entries, LPM routes, Policy CAM, multicast entries, and other hardware resources.

A common operational issue can occur when a leaf is assigned a profile that favors one type of resource, such as route scale, while reducing another important resource, such as Policy CAM.


This article explains a generic Cisco ACI troubleshooting scenario where a leaf switch experienced Policy CAM exhaustion because it was running a route-heavy scale profile, while the actual workload required higher policy capacity.

No customer-specific details, node names, fabric names, or exact production counters are included.


Problem Statement

A Cisco ACI leaf switch showed full Policy CAM utilization.

The hardware resource output indicated that the leaf had reached the maximum available Policy CAM entries under its current scale profile.

At the same time, LPM route usage was low compared to the available LPM capacity.

In simple terms:

LPM usage        : Low
Policy CAM usage : Fully consumed
Active profile   : High LPM / route-heavy profile

This means the leaf had a large amount of route-scale capacity available, but no remaining Policy CAM capacity.

Why Policy CAM Matters

Policy CAM is used for hardware programming of ACI policy rules, including:

  • Contracts

  • Filters

  • Zoning rules

  • EPG-to-EPG policy enforcement

  • L3Out external EPG policy enforcement

  • vzAny policies

  • Taboo contracts

  • Service graph-related rules

  • Permit, deny, redirect, and other policy actions

When Policy CAM is exhausted, the switch may not be able to program additional policy rules in hardware. This can lead to deployment failures, policy programming faults, or traffic impact depending on which rules fail to install.

In practical terms:

Policy CAM full = no remaining hardware space for additional policy rules.

Key Observation

The active forwarding scale profile was confirmed from the leaf using:

moquery -c topoctrlFwdScaleProf

The output showed that the leaf was actively running a High LPM type profile.

This was important because High LPM profiles are designed for environments that require large route scale. However, they may reduce the amount of hardware space available for Policy CAM.

So the issue was not that the leaf had too many routes. The issue was that the active profile allocated too much hardware capacity toward route scale while leaving insufficient capacity for policy scale.

Understanding the Scale Profile Trade-Off

ACI forwarding scale profiles are based on trade-offs.

A route-heavy profile usually provides:

Higher LPM route capacity
Lower Policy CAM capacity

A policy-heavy or balanced profile usually provides:

Higher Policy CAM capacity
Lower or moderate LPM route capacity

This is expected behavior. The problem occurs when the selected profile does not match the actual workload of the leaf.

In this scenario, the observed usage pattern was:

Route/LPM resources : Underutilized
Policy resources    : Fully utilized

This indicates a resource allocation mismatch.

Root Cause

The root cause was a scale profile mismatch.

The leaf was running a route-heavy forwarding profile, but the actual workload was policy-heavy. Because of this, Policy CAM reached its maximum limit while route-scale resources remained largely unused.

Root Cause Summary

The leaf was using a High LPM scale profile, which prioritizes LPM route capacity and reduces available Policy CAM capacity. Since the actual workload required more policy entries rather than high route scale, Policy CAM became fully consumed under the selected profile.

This type of issue is generally not a hardware failure. It is a hardware resource allocation limitation caused by the chosen scale profile.

Commands Used for Troubleshooting

1. Check Policy CAM Usage

On the leaf:

vsh_lc
show platform internal hal health-stats | grep policy

What to check:

policy_count
max_policy_count
policy_otcam_count
max_policy_otcam_count

If the current policy count equals the maximum policy count, the leaf has exhausted the available Policy CAM under the current profile.

2. Check LPM Usage

On the leaf:

vsh_lc
show platform internal hal l3 routingthresholds

Important fields:

Maximum HW Resources for LPM
Current LPM Usage in Hardware
Number of times limit crossed
Last time limit crossed

What to check:

  • Is LPM usage close to the available maximum?

  • Has the LPM limit ever been crossed?

  • Is the leaf actually using the route-scale capacity provided by the current profile?

If LPM usage is low but Policy CAM is full, the active profile may not be suitable for the leaf workload.

3. Check Overall Hardware Resource Usage

On the leaf:

vsh_lc
show platform internal hal health-stats

Focus on these sections:

L2 stats
L3 stats
Policy stats
Mcast stats

Important fields include:

LPM entries
LPM TCAM entries
Endpoint entries
Policy count
Maximum policy count

This command helps compare policy usage with routing and endpoint usage.

4. Confirm Active Forwarding Scale Profile

On the leaf:

moquery -c topoctrlFwdScaleProf

Look for:

currentProfile
profType

This confirms the active forwarding scale profile currently applied on the switch.

This step is important because a scale profile change may require a switch reload before it becomes active. The configured profile and the active runtime profile may not always match until the switch has reloaded.

5. Check Zoning Rule Scale

On the leaf:

show zoning-rule | wc -l
show zoning-rule
show zoning-filter

These commands help determine whether the switch is carrying a high number of policy or zoning rules.

6. Check Policy Programming Errors

From APIC:

moquery -c faultInst -f 'fault.Inst.severity!="cleared"' | egrep -i "policy|zoning|tcam|cam|hardware|resource|capacity|hal|acl|qos"

On the leaf:

show logging logfile | egrep -i "policy|zoning|tcam|cam|hardware|resource|capacity|hal|acl|qos"

Also check:

vsh_lc
show system internal policy-mgr errors
show system internal policy-mgr event-history errors

These logs help confirm whether policy programming failures are already occurring.

How to Interpret the Findings

The following combination is the key indicator:

LPM usage        : Low
Policy CAM usage : Full
Active profile   : High LPM

This means the leaf has been allocated large route-scale resources that it is not using, while the policy table has no available capacity remaining.

That points to a scale profile mismatch rather than a pure routing scale issue.
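The same interpretation can be written down as a small check. This is a sketch only: the function names, the thresholds (`policy_full`, `lpm_low`), and the counter values in the example are hypothetical and would need to be replaced with figures read from the leaf.

```python
def utilization(used: int, maximum: int) -> float:
    """Fraction of a hardware table that is in use."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return used / maximum


def looks_like_profile_mismatch(policy_used, policy_max, lpm_used, lpm_max,
                                profile, policy_full=1.0, lpm_low=0.25):
    """True when a route-heavy profile pairs a full Policy CAM with low LPM usage."""
    route_heavy = "lpm" in profile.lower()
    return (route_heavy
            and utilization(policy_used, policy_max) >= policy_full
            and utilization(lpm_used, lpm_max) <= lpm_low)


# Hypothetical counters, for illustration only.
print(looks_like_profile_mismatch(8192, 8192, 1000, 100000, "High LPM"))
print(looks_like_profile_mismatch(4000, 8192, 1000, 100000, "High LPM"))
```

The thresholds are a judgment call; the point is that both conditions have to hold together before the profile itself is suspected.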

Corrective Action

The recommended action is to review whether the High LPM profile is actually required on the affected leaf.

If the leaf does not need very high route scale, move it to a more suitable profile that provides higher Policy CAM capacity.

Depending on the design, possible profile options may include:

Profile Type           | General Use Case
Balanced / Dual Stack  | Balanced IPv4, IPv6, and policy requirements
IPv4-focused profile   | Environments that mainly require IPv4 route scale with better policy capacity
High Policy profile    | Environments with very high contract or policy scale requirements
High LPM profile       | Environments with very high route-scale requirements

The exact profile should be selected based on:

  • Current IPv4 route scale

  • Current IPv6 route scale

  • Expected route growth

  • Endpoint scale

  • Contract and filter scale

  • L3Out design

  • Whether the leaf is a compute leaf, border leaf, service leaf, or mixed-use leaf

  • Future growth expectations

Important Operational Note

Changing the forwarding scale profile usually requires a reload of the impacted switch before the new profile becomes active.

Operational considerations:

  • Plan the change during a maintenance window.

  • If the switch is part of a vPC pair, reload one leaf at a time.

  • Verify endpoint redundancy before reload.

  • Confirm vPC health before and after reload.

  • Confirm the active scale profile after reload.

  • Re-check Policy CAM and LPM usage after reload.

Pre-Change Validation Checklist

Before changing the scale profile, validate the current state.

show vpc brief
show interface status
show endpoint summary
show ip route vrf all summary
show ipv6 route vrf all summary
show forwarding route summary

On the leaf:

vsh_lc
show platform internal hal health-stats
show platform internal hal l3 routingthresholds

From APIC:

moquery -c faultInst -f 'fault.Inst.severity!="cleared"'

Also confirm the role of the leaf:

Compute leaf
Border leaf
Service leaf
Shared services leaf
Mixed workload leaf

This matters because a border leaf may need more route scale, while a compute or service leaf may need more policy scale.

Post-Change Validation Checklist

After changing the scale profile and reloading the switch, validate the active profile:

moquery -c topoctrlFwdScaleProf

Then check hardware resource usage again:

vsh_lc
show platform internal hal health-stats | egrep "policy|lpm|route"
show platform internal hal l3 routingthresholds

Expected result:

The maximum Policy CAM value should increase based on the selected profile.
Policy usage should no longer be at the maximum limit.

Also check for policy programming errors:

vsh_lc
show system internal policy-mgr errors
show system internal policy-mgr event-history errors

And confirm there are no active policy-related faults:

moquery -c faultInst -f 'fault.Inst.severity!="cleared"' | egrep -i "policy|zoning|tcam|cam|hardware|resource|capacity"

Best Practices

1. Match the scale profile to the leaf role

Do not select a route-heavy profile unless the leaf genuinely requires high route scale.

For example:

Border-heavy leaf  -> may need higher route scale
Policy-heavy leaf  -> may need higher Policy CAM scale
Compute leaf       -> often needs balanced endpoint and policy scale
Service leaf       -> may need higher policy and service graph capacity

2. Check resource usage before and after upgrades

Upgrades may expose existing scale pressure if a leaf was already close to a hardware limit.

Useful checks:

vsh_lc
show platform internal hal health-stats
show platform internal hal l3 routingthresholds

Collecting this data before and after upgrades helps identify whether resource usage changed significantly.

3. Monitor policy-heavy designs

Policy-heavy designs may include:

  • Many EPGs

  • Many contracts

  • Many filters

  • Many external EPGs

  • Broad vzAny usage

  • Taboo contracts

  • Service graph redirect

  • Microsegmentation

  • Shared services design

These designs should be monitored for Policy CAM usage.

4. Keep hardware resource headroom

Avoid operating any critical hardware table at full utilization.

When Policy CAM is full, even a small policy change may fail to program in hardware.

Final Summary

Policy CAM exhaustion in Cisco ACI is not always caused by a defect or hardware failure. In many cases, it can be caused by selecting a scale profile that does not match the leaf’s actual workload.

In this scenario, the leaf was running a route-heavy profile. Route-scale usage was low, but Policy CAM was fully consumed. This indicates that the switch needed a profile with more policy capacity rather than more route capacity.

The key lesson is:

Always match the forwarding scale profile to the actual role and workload of the leaf.

A proper scale-profile review, followed by a planned profile change and switch reload if required, can restore Policy CAM headroom and help prevent policy programming failures.
