Troubleshooting Logical Constructs
1. Check if the leaf has the required VRF
Leaf# show vrf all
2. Check on the VRF whether "Policy Control Enforcement Preference" is Enforced (the default; requires a contract between EPGs) or Unenforced (all EPGs inside the VRF can communicate without restriction).
3. Check vzAny
4. Check the BD: it must have a VRF assigned. Also check the other options: L2 Unknown Unicast (Hardware Proxy / Flood), L3 Unknown Multicast Flooding (Flood / Optimized Flood) and Multi Destination Flooding (Flood in BD / Drop / Flood in Encapsulation).
Apart from the above, also check whether "ARP flooding" & "Limit Local IP Learning To BD/EPG Subnet" are checked.
5. Policy --> L3 configuration: check whether a custom MAC address is set and whether Unicast Routing is checked.
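The enforcement check in step 2 can be sketched as a small decision function. This is only an illustration of the rule stated above, not how the leaf implements it: the function name and arguments are hypothetical, and it deliberately ignores vzAny, preferred groups and intra-EPG traffic.

```python
def inter_epg_traffic_allowed(enforcement: str, has_contract: bool) -> bool:
    """Return True if traffic between two different EPGs in one VRF is permitted.

    enforcement  -- value of "Policy Control Enforcement Preference" on the VRF
    has_contract -- whether a contract exists between the two EPGs
    """
    mode = enforcement.strip().lower()
    if mode == "unenforced":
        # All EPGs inside the VRF can communicate without restriction.
        return True
    if mode == "enforced":
        # A contract between the EPGs is required.
        return has_contract
    raise ValueError(f"unknown enforcement preference: {enforcement!r}")


if __name__ == "__main__":
    print(inter_epg_traffic_allowed("Enforced", has_contract=False))
    print(inter_epg_traffic_allowed("Unenforced", has_contract=False))
```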
Next is to check subnet in BD or EPG
To see all subnets on VRF
Leaf # show ip interface brief vrf Sales:Presales_VRF
iping Command
iping -V Tenant1:VRF1 -s 10.0.2.254 10.0.2.1
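When the same test has to be repeated for several VRFs, the command line can be assembled by a small helper. The sketch below only builds the string in the form shown above (iping -V Tenant:VRF -s source destination); the function name is hypothetical and nothing is sent to a switch.

```python
import ipaddress


def build_iping(tenant: str, vrf: str, source_ip: str, destination_ip: str) -> str:
    """Build an iping command line in the form used in these notes."""
    # Validate the addresses early so a typo is caught before pasting on the leaf.
    ipaddress.ip_address(source_ip)
    ipaddress.ip_address(destination_ip)
    return f"iping -V {tenant}:{vrf} -s {source_ip} {destination_ip}"


if __name__ == "__main__":
    print(build_iping("Tenant1", "VRF1", "10.0.2.254", "10.0.2.1"))
```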
APIC Troubleshooting
Bond 0 : In-band Management
Bond 1 : OOB Management
Leaf1# show lldp neighbors --> You must see APIC
APIC# cat /proc/net/bonding/bond0
APIC# acidiag cluster
APIC# acidiag avread
APIC # acidiag verifyapic
APIC# cluster_health
Change cluster size
Note: If you want to reduce the cluster size, first decommission the unwanted APIC.
If there are three APICs and one of them is down, the replicas still remain read-write.
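The three-APIC statement above follows from a simple majority rule. The sketch below assumes that a replica set stays read-write while a majority of its replicas is healthy and becomes read-only otherwise; the function name is hypothetical and the real state should always be confirmed with the acidiag commands above.

```python
def replica_state(total_replicas: int, healthy_replicas: int) -> str:
    """Return 'read-write' if a majority of replicas is healthy, else 'read-only'."""
    if not 0 <= healthy_replicas <= total_replicas:
        raise ValueError("healthy_replicas must be between 0 and total_replicas")
    majority = total_replicas // 2 + 1
    return "read-write" if healthy_replicas >= majority else "read-only"


if __name__ == "__main__":
    # Three APICs, one down: two of three replicas are still healthy.
    print(replica_state(3, 2))
```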
Troubleshooting Endpoint Learning
Spine # show coop internal info ip-db | grep -E "address|Vrf"
End Point Retention Policy
Go to BD --> Policy
Data Plane Learning is on VRF
System -->System Settings --> EndPoint Controls
Loop Detection
Rogue EP Control
System -->System Settings --> Fabric-Wide Settings
Disable Remote EP Learning & Enforce subnet check
leaf-a# itraceroute src-ip 10.0.1.1 10.0.1.2 vrf Sales:Presales_VRF encap vlan 11
On leaf-a, issue itraceroute to the APP_VM (10.0.1.2) using the source IP address of the WEB_VM (10.0.1.1)
Access VLAN: Classifies traffic into a security Endpoint Group (EPG)
Bridge Domain (BD) VLAN: Maps L2 traffic into a bridge domain VLAN Network Identifier (VNID)
Flood Domain (FD) VLAN: Leaf locally significant VLAN assigned to an access VLAN within an EPG.
To check ip and mac address of all Endpoints present over vrf
Leaf # show endpoint vrf Tenant1:VRF1 detail
To see vlans provisioned to ports
Leaf # show vlan brief
Leaf # show vlan extended
Leaf # show mac address-table vlan 20
The simple output that provides the connection for a specific FD_VLAN and the access encapsulation VLAN and BD_VLAN can be obtained as in this example for FD_VLAN 20:
The mappings between the access encapsulation and internal VLAN for EPG and EPG to BD_VLAN can also be obtained by connecting to the module using vsh_lc command. Then issue the show system internal eltmc info vlan brief command.
Types of VLANs used in the Cisco ACI:
Encap: The encapsulation (VLAN or VXLAN) of the virtual machine manager (VMM) for the associated endpoint group. This is the VLAN assigned to the endpoint group that the attached endpoint is in.
Bridge domain VLAN: BD-VLAN is used to represent a bridge domain and can link multiple flood domain VLANs (FD-VLANs) together with multiple hardware VLANs and internal VLANs. The forwarding aspect is used by the Broadcom ASIC to determine if traffic should be locally switched or forwarded to the NorthStar ASIC for processing. The BD-VLAN connects different local FD-VLANs to a single bridge domain, and is used on the Broadcom ASIC to determine the Layer 2 broadcast domain which might contain multiple subnets or ACCESS_ENC.
Flood domain VLAN: The FD-VLAN is the forwarding VLAN used to forward traffic on the Broadcom ASIC. The FD_VLAN is directly linked to the ACCESS_ENC and is also referred to as the internal VLAN. The FD_VLAN is used to represent the ACCESS_ENC instead of linking it directly to the BD_VLAN. The FD_VLAN allows the BD_VLAN to link to different ACCESS_ENCs and treat them as if they are all in the same 802.1Q VLAN on an NX-OS switch. When a broadcast packet comes into the leaf switch from the ACI fabric, the BD_VLAN can map to several FD_VLANs to allow the packet to be forwarded out different ports using different ACCESS_ENCs. The FD_VLAN is used to learn Layer 2 MAC addresses.
Platform independent VLAN: The PI-VLAN is commonly seen when executing switch show commands. PI VLANs are locally significant to each switch and are not consistent across all leafs.
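The relationship between BD_VLAN, FD_VLAN and ACCESS_ENC described above can be pictured as a small data model. This is a sketch for illustration only: the class names, VLAN numbers and port names are hypothetical and do not come from any switch output.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FloodDomainVlan:
    fd_vlan: int            # leaf-local internal VLAN
    access_encap: str       # encapsulation seen on the wire, e.g. "vlan-11"
    ports: List[str] = field(default_factory=list)


@dataclass
class BridgeDomainVlan:
    bd_vlan: int
    fd_vlans: List[FloodDomainVlan] = field(default_factory=list)

    def flood_targets(self) -> List[Tuple[str, str]]:
        """Return (port, access_encap) pairs a broadcast arriving from the
        fabric for this bridge domain would be forwarded to."""
        return [(port, fd.access_encap) for fd in self.fd_vlans for port in fd.ports]


if __name__ == "__main__":
    bd = BridgeDomainVlan(
        bd_vlan=30,
        fd_vlans=[
            FloodDomainVlan(20, "vlan-11", ["eth1/1"]),
            FloodDomainVlan(21, "vlan-12", ["eth1/2", "eth1/3"]),
        ],
    )
    for port, encap in bd.flood_targets():
        print(port, encap)
```

One BD_VLAN linking two FD_VLANs is what lets a single broadcast leave different ports with different access encapsulations.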
Troubleshooting L3
To check L3 drops
Tenant --> Operational --> Flows --> L3 drop
To see the above information from the switch:
Leaf # show logging ip access-list internal packet-log deny | more
Let's say connectivity between two L3 out endpoints has to be verified.
Step 1: See if the external host is reachable through the L3out
Leaf # iping -V Tenant1:VRF1 -s 10.2.1.1 2.2.2.2
10.2.1.1 --> interface which is being used to make neighborship for L3out
Step 2: show ip route vrf Tenant:VRF
See if the subnet for the external network is in the routing table on the border leaf.
Step 3: check neighbourship
Step 4: Check the interface on which L3out has been created.
Troubleshooting VMware
You can verify that the VM's network adapter is connected to the appropriate network (port group). So, when you are using the WebUI you should ensure that the connected option is checked. When using the client, connected (if the VM is powered on) and connect at power on should be chosen.
The VMs using their virtual NICs (vNIC) can communicate through the virtual switch (vSwitch) on the hypervisor. There are two different types of vSwitches in VMware:
Standard virtual switch: Configured on each VMware ESXi host (the span is limited to its host) and does not require vSphere. This vSwitch cannot be used with VMM integration in Cisco ACI, but it can be used with Cisco ACI physical domains.
Distributed virtual switch (DVS): Configured in one place, while the configuration is distributed to the hosts. It requires VMware vSphere and an Enterprise Plus license. The DVS is required for VMM integration in Cisco ACI.
Each vSwitch has the following elements:
Virtual ports:
Mapped to vNICs on the VMs.
Port groups:
Mapped to VLANs that are utilized on the physical network.
Represent a Layer 2 broadcast domain, since there is Layer 2 connectivity between virtual ports belonging to the same port group.
Uplinks:
Mapped to a physical NIC (pNIC, also referred to as VMNIC) to provide connectivity to the outside network.
vSwitch has exclusive use of the VMNIC.
Click on Physical Adapter --> click on blue icon for vmnic0
Click on VM Network
Verifying Leafs
The leaf switches in the Cisco ACI fabric provide connections for the servers, which can serve as hypervisor hosts in the data center. Servers can be rack-mount units, such as Cisco UCS C-Series, or blade servers, such as Cisco UCS B-Series.
To benefit from the Cisco ACI fabric functionalities, servers should have dual-homed connection to the leaf switches, as in this figure:
To utilize both uplink connections from the server to the Cisco ACI fabric, you can use MAC pinning or Link Aggregation Control Protocol (LACP) configuration to have active-active uplinks.
However, if you are using Cisco UCS blade servers, you can only implement MAC pinning on the server side for active-active configuration.
That is because Cisco UCS Fabric Interconnects do not support LACP or vLACP on the southbound ports towards the blade servers.
With Cisco UCS blade servers, the server links (vNICs on the blades) are associated with a single uplink port, which is referred to as pinning, while the selected external interface is called a pinned uplink port.
You can configure a static or dynamic pinning process when you configure the vNICs. When using LACP, the load-balancing method for active-active uplinks can be based on the IP hash.
If you are using the route based on IP hash option for load balancing, it requires that a port channel or vPC is configured on the leaf switch or leaf switches, respectively.
The following are common problems during integration of Cisco ACI and VMM:
Wrong credentials for the VMM (such as VMware vCenter, Microsoft System Center Virtual Machine Manager [SCVMM], and so on). For example, credentials can be wrong in the first place, can be changed, or are no longer valid.
Wrong permissions are assigned to the account that you are using for the connection to the VMM, such as the account for the vCenter credential information in the APIC GUI.
Wrong data center hostname (or IP address), such as when you specify the hostname or IP address for the vCenter controller.
Out of VLANs in dynamic VLAN pools. For example, when you do not allocate enough VLANs in the defined range.
Inconsistent port group configurations, due to:
Disassociating the VMM domain from the EPG when the VMs are still attached to the port group (order of operations).
Deleting the port group in VMware (wrong direction).
Manually changing the VLAN that is assigned to the port group.
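The "out of VLANs" problem in the list above is simple arithmetic that can be checked before associating more EPGs. The sketch below assumes that each EPG associated with the VMM domain consumes one VLAN from the dynamic pool; the function name and the pool range are hypothetical.

```python
def free_dynamic_vlans(pool_start: int, pool_end: int, epgs_in_domain: int) -> int:
    """Return how many VLANs remain in an inclusive dynamic pool range.

    A negative result means the pool is too small for the number of EPGs.
    Assumption: one VLAN is allocated per EPG associated with the VMM domain.
    """
    if pool_end < pool_start:
        raise ValueError("pool_end must not be lower than pool_start")
    pool_size = pool_end - pool_start + 1
    return pool_size - epgs_in_domain


if __name__ == "__main__":
    # Hypothetical pool 1000-1009 with 8 EPGs already associated.
    print(free_dynamic_vlans(1000, 1009, 8))
```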
The service graph is always associated with a contract between two EPGs. A service graph template is a sequence of Layer 4 to Layer 7 functions and Layer 4 to Layer 7 devices. It is a reusable, generic representation of the expected traffic flow that defines connection points and nodes.
Cisco ACI supports different deployment modes for Layer 4-7 devices with the service graph, such as Go-To mode (also known as routed mode) where the traffic is routed on the Layer 4-7 service device (for example, it can be the default gateway for the servers).
It also can be Go-Through mode (also known as transparent mode or bridged mode), where the default gateway for the servers is the client-side bridge domain, and the Layer 4-7 device bridges the client-side bridge domain and the server-side bridge domain.
A concrete device represents a service device, for example, one load balancer, or one firewall. A concrete device can be either a physical device or a virtual machine.
Concrete device: Represents a service device, which can be physical or virtual.
Logical device: Represents a cluster of two devices, which can operate, for example, in an active/standby mode. It also defines the logical interfaces (defined in the device model) for device selection policy.
The deployment steps are the following:
Define the Layer 4-7 service device. The configuration can comprise a single device or two devices (such as an active-standby high-availability pair).
Create the service graph template.
Attach the service graph template to the contract subject. The service graph template must be associated with the contract (between EPGs).
The rendering layer determines which device should be used, if you are using two or more devices.
Layer 4–7 Service Insertion Modes
Unmanaged mode: Offers some configuration automation and simplification. It is a commonly used mode, where the configuration of the Layer 4–7 device is performed separately.
Managed mode: The APIC can push configuration to a service node via a device package.
Policy-based redirect (PBR): Utilizes PBR, as one of the main features of the service graph, where the Cisco ACI fabric can redirect traffic between security zones to Layer 4–7 devices. With PBR, the Layer 4–7 device does not need to be the default gateway for the servers.
Copy services: Unlike SPAN that duplicates all of the traffic, the Cisco ACI copy services feature enables selectively copying portions of the traffic between endpoint groups, according to the specifications of the contract. A copy service is configured as part of a Layer 4 to Layer 7 service graph template that specifies a copy cluster as the destination for the copied traffic.
Service chaining: The Layer 4–7 service insertion feature enables you to insert more than one service between two EPGs and create a service chain between them.
The following summarizes the steps you should take while verifying the PBR configuration:
1. Define the Layer 4–7 device (single, high-availability, or cluster) in Cisco APIC GUI using Tenants > Tenant_name > Services > L4–L7 > Devices, right-click to Create L4–L7 Devices:
The type of device (firewall, load balancing).
Where to find the device?
Virtual or physical (choose a domain)
2. Configure PBR policies using Tenants > Tenant_name > Policies > Protocol > L4-L7 Policy-Based Redirect, right-click to Create L4-L7 Policy-Based Redirect:
PBR policies define the next hop for the traffic that will be sent through the L4-L7 device.
Note: You may define them while applying the service graph template as well.
3. Define a service graph template, within the tenant, on Services > L4–L7 > Service Graph Templates, right-click to Create L4–L7 Service Graph Template.
The device that you will use.
The topology shows a function node that is connected to the consumer and provider EPGs.
4. Apply the service graph template: right-click the service graph template and choose Apply L4–L7 Service Graph Template.
Choose the EPGs and contract (create new or reference an existing one) that instructs Cisco ACI which traffic to send to the device.
Choose how the device is connected.
Consumer connector: Redirect policy
Provider connector: Redirect policy
Resolution Immediacy: It is used for VMM domain. This option controls when VRF, bridge domains, and SVIs are programmed on the leaf nodes.
Pre-provision: The policy is configured to leaf regardless of Cisco Discovery Protocol or Link Layer Discovery Protocol (LLDP) relationship, even without a host connected to the VMM switch. For example, if an EPG is associated with a VMM domain, the bridge domain and the VRF to which the EPG refers are pushed on all of the leaf nodes, where the VMM domain is configured.
Immediate: Policy is configured on a leaf when a hypervisor that is connected to this leaf is attached to an APIC VMM DVS. A discovery protocol, such as Cisco Discovery Protocol/LLDP or the OpFlex protocol, is used to form the adjacency and discover to which leaf the virtualized host is attached.
On demand: Policy is configured on a leaf when a hypervisor that is connected to this leaf is attached to an APIC VMM DVS and at least one virtual machine on the host is connected to a port group and EPG that is associated with this physical NIC and leaf.
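The three Resolution Immediacy options can be summarized as a decision function. This sketch only restates the descriptions above; the function and argument names are hypothetical and do not correspond to any APIC API.

```python
def policy_resolved_on_leaf(immediacy: str,
                            hypervisor_attached: bool,
                            vm_on_port_group: bool) -> bool:
    """Return True if VRF, bridge domain and SVI would be programmed on the leaf.

    hypervisor_attached -- a hypervisor connected to this leaf is attached to
                           the APIC VMM DVS (adjacency seen via CDP/LLDP/OpFlex)
    vm_on_port_group    -- at least one VM on that host is connected to the
                           port group of the EPG
    """
    mode = immediacy.strip().lower().replace("-", " ")
    if mode == "pre provision":
        # Pushed regardless of discovery adjacency or attached hosts.
        return True
    if mode == "immediate":
        return hypervisor_attached
    if mode == "on demand":
        return hypervisor_attached and vm_on_port_group
    raise ValueError(f"unknown resolution immediacy: {immediacy!r}")


if __name__ == "__main__":
    print(policy_resolved_on_leaf("On demand", True, False))
```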
Deployment Immediacy: It is used for both Physical & VMM Domain. This option controls when contracts are programmed in the hardware.
Immediate: The policy CAM is programmed on the leaf when the policy is resolved to the leaf (see resolution immediacy, above), regardless of whether the virtual machine on the virtualized host has sent traffic.
On Demand: The policy CAM is programmed after the virtual machine sends the first packet, when the first data-plane packet reaches the leaf to trigger an endpoint learning for the EPG.
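Deployment Immediacy builds on the resolution step, which the following sketch tries to make explicit. It is a reading of the two descriptions above, with hypothetical function and argument names, not a description of the actual leaf software.

```python
def policy_cam_programmed(deployment_immediacy: str,
                          policy_resolved: bool,
                          first_packet_seen: bool) -> bool:
    """Return True if contracts would be programmed in the policy CAM.

    policy_resolved   -- the policy has already been resolved to the leaf
                         (see Resolution Immediacy)
    first_packet_seen -- the first data-plane packet reached the leaf and
                         triggered endpoint learning for the EPG
    """
    if not policy_resolved:
        # Nothing can be programmed before the policy reaches the leaf.
        return False
    mode = deployment_immediacy.strip().lower()
    if mode == "immediate":
        return True
    if mode == "on demand":
        return first_packet_seen
    raise ValueError(f"unknown deployment immediacy: {deployment_immediacy!r}")


if __name__ == "__main__":
    print(policy_cam_programmed("On Demand", True, False))
```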
Verify Policy CAM Status
The main methods to program zoning-rules within Cisco ACI are as follows:
EPG-to-EPG contracts: Typically requires at least one consumer and one provider to program zoning-rules across two or more distinct endpoint groups.
Preferred groups: Requires enabling grouping at the VRF level, where all members of the group can communicate freely. Non-members require contracts to allow flows to the preferred group. Each VRF can have one preferred group.
vzAny: An EPG collection that is defined under a given VRF. vzAny represents all EPGs in the VRF. Usage of vzAny allows flows between one EPG and all EPGs within the VRF via one contract connection.
To view the available resources in Cisco ACI, choose Operations > Capacity Dashboard in the APIC GUI menu bar.
On GitHub
A fabric resource calculation tool is available.
FabricResourceCalculation/policyTCAM.py script is available.
Moquery Commands
Find the number of rules: Recommended to stay under 50,000
apic# moquery -c actrlRule -x rsp-subtree-include=count | grep count
count : 24504
Find the number of DNs associated to the contract: Recommended to be under 80,000
apic# moquery -c vzRsRFltAtt -x rsp-subtree-include=count | grep count
count : 42587
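If these counts are collected regularly, the "count" line can be parsed and compared against the recommended limits quoted above (50,000 rules, 80,000 contract DNs). The sketch below works on text that has already been captured from the APIC; the function names are hypothetical and it does not run moquery itself.

```python
import re

RULE_LIMIT = 50_000          # recommended maximum for actrlRule, per the notes
CONTRACT_DN_LIMIT = 80_000   # recommended maximum for vzRsRFltAtt, per the notes


def parse_count(moquery_output: str) -> int:
    """Extract the integer from a 'count : N' line of captured moquery output."""
    match = re.search(r"^\s*count\s*:\s*(\d+)\s*$", moquery_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'count' line found in output")
    return int(match.group(1))


def within_limit(moquery_output: str, limit: int) -> bool:
    """Return True if the parsed count is below the given recommended limit."""
    return parse_count(moquery_output) < limit


if __name__ == "__main__":
    print(within_limit("count : 24504", RULE_LIMIT))
    print(within_limit("count : 42587", CONTRACT_DN_LIMIT))
```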
The following figure shows two different configurations in Cisco APIC GUI, where the first one (left side) utilizes one filter with multiple filter entries, while the second one (right side) uses individual filters for the same services.
If you are using 7 individual filters, the product for a given contract would equal 7 for just one source and destination combination. However, the product for one source and 10 destinations would equal 70.
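The arithmetic in the paragraph above is a plain product of filters, sources and destinations. A minimal sketch, with a hypothetical function name, that reproduces the two figures given:

```python
def filter_product(filters: int, sources: int, destinations: int) -> int:
    """Return filters x sources x destinations for one contract."""
    if min(filters, sources, destinations) < 0:
        raise ValueError("counts must not be negative")
    return filters * sources * destinations


if __name__ == "__main__":
    print(filter_product(7, 1, 1))    # one source, one destination
    print(filter_product(7, 1, 10))   # one source, ten destinations
```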
Static Routes in the fabric
Go to Fabric > Inventory > Pod_number > Leaf_name > Protocols > IPV4, and choose IPv4 for a specific VRF (for example, VRF-T01:DB-VRF). From the working pane, choose Operational > Static Routes to inspect the static routes in the overlay, so you can find the leaked routes.
You can also use the APIC GUI via Fabric > Inventory > Pod_number > Leaf_name > Rules to inspect the security zoning-rules.
You can also observe the packets in the APIC GUI, for EPG.
L3 Packet Drop from GUI
Tenant --> Operational --> Packet ---> L3 drop
Leaf # show logging ip access-list internal packet-log deny | more
Tenant --> Operational --> Resource IDs
Segment ID is the VXLAN ID that is assigned as a VRF tag. It is autogenerated by Cisco ACI and associated with the VRF.
Similar to the VRF Segment ID, all EPGs have a pcTag ID that is autogenerated by Cisco ACI and associated with each individual EPG. This tag is significant because it is used by Cisco ACI for communication between EPGs.
Leaf # show zoning-rule | more
Get the VXLAN ID (Segment ID) for VRF Presales_VRF
Leaf # show vrf Sales:Presales_VRF detail extended
It is 2195464
Leaf # show system internal epm endpoint ip 10.0.1.1
On the APIC, you could obtain the pcTags using the moquery command, even without learned endpoints.
apic1# moquery -d uni/tn-Sales/ap-eCommerce_AP/epg-Web_EPG
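The distinguished name passed to moquery -d follows the pattern visible in the command above (uni/tn-Tenant/ap-AppProfile/epg-EPG). The sketch below builds that string for other EPGs; the helper names are hypothetical and nothing is executed on the APIC.

```python
def epg_dn(tenant: str, app_profile: str, epg: str) -> str:
    """Build an EPG distinguished name in the form used in these notes."""
    for name in (tenant, app_profile, epg):
        if not name or "/" in name:
            raise ValueError(f"invalid object name: {name!r}")
    return f"uni/tn-{tenant}/ap-{app_profile}/epg-{epg}"


def moquery_by_dn(dn: str) -> str:
    """Build the moquery command line for a given DN."""
    return f"moquery -d {dn}"


if __name__ == "__main__":
    print(moquery_by_dn(epg_dn("Sales", "eCommerce_AP", "Web_EPG")))
```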
Leaf-a# show zoning-rule src-epg 49158 dst-epg 49156
leaf-a# show system internal policy-mgr stats | grep 2195464
SSH from the Web VM to the App VM twice to see the packet counter increase.