Mukesh Chanderia

ACI Multi-Site Orchestrator (MSO) Tshoot - Part 1

Troubleshooting Tools: This part explains how to use the Multi-Site troubleshooting report, API call logs, data collected from the Orchestrator VMs, and container logs to verify microservices and policy resolution on Cisco APIC sites. It also covers:

  • Consistency checker

  • Docker container information

  • Execution logs

  • APIC policy resolution



Consistency Checker


The Consistency Checker is a feature in Cisco ACI Multi-Site Orchestrator that verifies templates after they have been deployed. It is built into the Orchestrator GUI and checks that cross-site mappings are correct.


You can use this tool on any template that has been deployed across at least two sites and includes at least one of the following policies:

  • Endpoint Group (EPG)

  • Virtual Routing and Forwarding (VRF)

  • Bridge Domain (BD)

  • External EPG



Verifying a Deployed Template Across Sites


To ensure your deployed templates are consistent across multiple sites, follow these steps:


Before You Begin

Ensure that the template you want to verify:

  • Has been deployed across at least two sites.

  • Contains at least one of these policies: EPG, VRF, BD, or External EPG.
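The two prerequisites above can be expressed as a small check. This is only an illustrative sketch: the `Template` class, its fields, and the sample template names are hypothetical and do not correspond to any Orchestrator API object.

```python
from dataclasses import dataclass, field

# Policy types the Consistency Checker can verify, per the list above.
VERIFIABLE_POLICIES = {"EPG", "VRF", "BD", "External EPG"}


@dataclass
class Template:
    """Hypothetical stand-in for a deployed template (not an MSO API object)."""
    name: str
    sites: list = field(default_factory=list)
    policy_types: set = field(default_factory=set)


def can_verify(template):
    """Return True if the template meets both prerequisites."""
    deployed_widely = len(set(template.sites)) >= 2
    has_policy = bool(template.policy_types & VERIFIABLE_POLICIES)
    return deployed_widely and has_policy


# Hypothetical examples: one stretched template, one single-site template.
stretched = Template("web-tier", ["site5", "site6"], {"EPG", "BD"})
local_only = Template("lab", ["site5"], {"VRF"})
print(can_verify(stretched), can_verify(local_only))
```

A template deployed to only one site fails the check even if it contains a verifiable policy, which matches the first prerequisite.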

Steps

  1. Log In: Access the Multi-Site Orchestrator GUI.

  2. Select the Schema:

    • Navigate to the Schemas section from the main menu.

    • On the Schema List page, choose the appropriate schema.

  3. Choose the Template:

    • Click on the deployed template you wish to verify.

  4. Initiate Verification:

    • In the top-right corner, click on Unverified.

  5. Run the Consistency Checker:

    • In the Template Verification Summary dialog box, click VERIFY.

    • A message will appear: "Consistency verification has been successfully triggered."

  6. Review Verification Status:

    • The status will update to either:

      • Verification Successful — No action needed.

      • Verification Failed — Action required.

    • If it failed:

      • Click on Verification Failed.

      • For the site(s) that failed, click the pencil icon to view a detailed report.

      • Hover over the red X to see the issue description, which could be:

        • Not Found — Unable to locate the policy.

        • Mismatch — Misconfiguration detected.

      • You can then:

        • Download the report for the current site.

        • Verify Template across all sites again.


Setting Up Scheduled Verification for Deployed Templates


Automate the verification process for every deployed template on a per-tenant basis:

Steps

  1. Log In: Access the Multi-Site Orchestrator GUI.

  2. Access Tenant Settings:

    • Go to the Tenant section from the main menu.

    • On the Tenant List page, click Set Schedule for the desired tenant.

  3. Configure the Schedule:

    • In the Consistency Checker Scheduler Settings, uncheck Disable Schedule.

    • Select the preferred time and frequency for verification.

    • Click OK to save your settings.



Troubleshooting Verification Errors


If you encounter errors during verification, here's how to troubleshoot them:

Steps

  1. Log In: Access the Multi-Site Orchestrator GUI.

  2. Open the Schema Health Dashboard:

    • Navigate to the Dashboard.

    • In the Schema Health section, click on the schema verification icon in the View By field.

    • You'll see small squares representing templates within each site, color-coded by status:

      • Green — Passed verification.

      • Red — Failed verification.

      • Yellow — Unverified.

  3. Identify Issues:

    • Expand any schema containing a red indicator to reveal the problematic templates.

    • Hover over red sites to see that they have FAILED.

  4. View Detailed Report:

    • Click on a failed site to open a detailed report.

    • Hover over the red X icons to read descriptions of the issues:

      • Not Found — The policy couldn't be located.

      • Mismatch — There's a configuration mismatch.

  5. Take Action:

    • Choose to Download the report for the current site or Verify Template across all sites.

  6. Check Template Statuses:

    • Review which templates have passed, failed, or remain unverified.

  7. Optional Actions:

    • To verify the entire schema, click the ... (ellipsis) next to the schema name and select Verify Schema.

    • To search for specific policies (EPG, BD, VRF, or External EPG), use the search function to find which schemas contain them.


Downloading System Logs


Generate a troubleshooting report and download system logs for all schemas, sites, tenants, and users managed by Cisco ACI Multi-Site Orchestrator:

Steps

  1. Log In: Access the Multi-Site Orchestrator GUI.

  2. Navigate to Tech Support:

    • From the main menu, select Operations > Tech Support.

  3. Download Logs:

    • In the System Logs frame, click the Edit button in the top-right corner.

    • Select the logs you wish to download.

    • Click the Download button.

    • An archive file will be downloaded to your system, containing:

      • All schemas in JSON format.

      • All site definitions in JSON format.

      • All tenant definitions in JSON format.

      • All user definitions in JSON format.

      • All container logs in the infra_logs.txt file.
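Once the archive has been extracted, a quick inventory of its contents can help confirm that the download is complete. The sketch below assumes the archive was already unpacked into a local directory; apart from infra_logs.txt, the file names inside the archive are not stated here, so the code simply scans for any .json file.

```python
import json
from pathlib import Path


def summarize_report(report_dir):
    """Map each JSON file name to its top-level type and item count."""
    summary = {}
    for path in sorted(Path(report_dir).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        size = len(data) if isinstance(data, (list, dict)) else 1
        summary[path.name] = (type(data).__name__, size)
    return summary


def has_infra_logs(report_dir):
    """Check whether infra_logs.txt exists anywhere under the report directory."""
    return any(Path(report_dir).rglob("infra_logs.txt"))
```

Calling `summarize_report("/path/to/extracted/report")` returns a dictionary that is easy to eyeball for empty or missing definitions.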




Gathering Docker Container Information


You can access one of the Orchestrator VMs to collect information about Docker services and their logs for specific containers. Here’s how to do it:


1) Checking Docker Container Health


To ensure Docker services are running smoothly, use the following command:


# docker service ls


This command displays the health status of each service. Look at the REPLICAS column to make sure all containers are running as expected. If any container is down, it may indicate an issue that needs attention.


Example Output:


ID NAME MODE REPLICAS [...]

ve5m9lwb1qc4 msc_auditservice replicated 1/1 [...]

bl0op2eli7bp msc_authyldapservice replicated 1/1 [...]

uxc6pgzficls msc_authytacacsservice replicated 1/1 [...]
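Reading the REPLICAS column by eye gets tedious when many services are listed. The following sketch parses saved `docker service ls` output and reports services running fewer replicas than desired. The `0/1` row in the sample is invented for illustration and is not taken from a real system.

```python
def degraded_services(service_ls_output):
    """Return (name, running, desired) for services below their desired replica count."""
    degraded = []
    for line in service_ls_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        name, replicas = fields[1], fields[3]
        running, _, desired = replicas.partition("/")
        if not (running.isdigit() and desired.isdigit()):
            continue  # not a "running/desired" value, ignore the row
        if int(running) < int(desired):
            degraded.append((name, int(running), int(desired)))
    return degraded


# The second data row is a made-up failure case.
sample = """\
ID NAME MODE REPLICAS
ve5m9lwb1qc4 msc_auditservice replicated 1/1
bl0op2eli7bp msc_authyldapservice replicated 0/1
"""
print(degraded_services(sample))
```

An empty result means every listed service has all of its replicas running.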


2) Finding Container IDs


To list all running container IDs, use:


# docker ps


Example Output:


CONTAINER ID IMAGE COMMAND [...]

05f75d088dd1 msc-ui:2.1.2g "/nginx.sh" [...]

0ec142fc639e msc-authyldap:v.4.0.6 "/app/authyldap.bin" [...]


3) If you need the container ID for a specific service, use:


# docker ps | grep <service-name>


Example:


# docker ps | grep executionengine


Output:


685f54b70a0d msc-executionengine:2.1.2g "bin/executionengine" [...]


4) To include containers that have stopped, use:


# docker ps -a | grep <service-name>


Example:


# docker ps -a | grep executionengine


Output:


685f54b70a0d msc-executionengine:2.1.2g "bin/executionengine" Up 2 weeks (healthy)

3870d8031491 msc-executionengine:2.1.2g "bin/executionengine" Exited (143) 2 weeks ago
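The exit code in the STATUS column is worth decoding. By the usual convention, an exit code above 128 means the process was terminated by signal number (code - 128), so 143 corresponds to signal 15 (SIGTERM). The sketch below extracts exited containers from saved `docker ps -a` output; the regular expression assumes the 12-character short container ID at the start of each line, as in the output above.

```python
import re
import signal

# Short container ID at line start, then "Exited (<code>)" somewhere in the STATUS column.
EXITED = re.compile(r"^(?P<cid>[0-9a-f]{12})\s.*\bExited \((?P<code>\d+)\)")


def describe(code):
    """Give a rough, convention-based reading of a container exit code."""
    if code == 0:
        return "clean exit"
    if code > 128:
        try:
            return "killed by " + signal.Signals(code - 128).name
        except ValueError:
            return "killed by signal %d" % (code - 128)
    return "non-zero exit (application-defined)"


def exited_containers(ps_output):
    """Return (container_id, exit_code, description) for each exited container."""
    results = []
    for line in ps_output.splitlines():
        match = EXITED.search(line.strip())
        if match:
            code = int(match.group("code"))
            results.append((match.group("cid"), code, describe(code)))
    return results


sample = '''
685f54b70a0d msc-executionengine:2.1.2g "bin/executionengine" Up 2 weeks (healthy)
3870d8031491 msc-executionengine:2.1.2g "bin/executionengine" Exited (143) 2 weeks ago
'''
print(exited_containers(sample))
```

A SIGTERM exit like this one is typical of a container that was stopped or replaced in an orderly way, whereas other non-zero codes usually deserve a look at the container log.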


5) Viewing Container Logs


To see the logs for a specific container, use:


# docker logs <container-id>


Note: Container logs can be very large, so displaying or transferring them may take some time.


6) Log File Location:


/var/lib/docker/containers/<container-id>/

You might find multiple log files like <container-id>-json.log.


Example:


# cd /var/lib/docker/containers

# ls -al


total 140

drwx------. 47 root root 4096 Jul 9 14:25 .

drwx--x--x. 14 root root 4096 May 7 08:31 ..

drwx------. 4 root root 4096 Jun 24 09:58 051cf8e374dd9a3a550ba07a2145b92c6065eb1071060abee12743c579e5472e

drwx------. 4 root root 4096 Jul 11 12:20 0eb27524421c2ca0934cec67feb52c53c0e7ec19232fe9c096e9f8de37221ac3


7) To view logs for a specific container:


# cd 051cf8e374dd9a3a550ba07a2145b92c6065eb1071060abee12743c579e5472e/

# ls -al


total 48

drwx------. 4 root root 4096 Jun 24 09:58 .

drwx------. 47 root root 4096 Jul 9 14:25 ..

-rw-r-----. 1 root root 4572 Jun 24 09:58 051cf8e374dd9a3a550ba07a2145b92c6065eb1071060abee12743c579e5472e-json.log

drwx------. 2 root root 6 Jun 24 09:58 checkpoints

-rw-------. 1 root root 4324 Jun 24 09:58 config.v2.json

-rw-r--r--. 1 root root 1200 Jun 24 09:58 hostconfig.json

-rw-r--r--. 1 root root 13 Jun 24 09:58 hostname

-rw-r--r--. 1 root root 173 Jun 24 09:58 hosts

drwx------. 3 root root 16 Jun 24 09:58 mounts

-rw-r--r--. 1 root root 38 Jun 24 09:58 resolv.conf

-rw-r--r--. 1 root root 71 Jun 24 09:58 resolv.conf.hash
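Because `docker ps` prints a 12-character short ID while the directory name is the full 64-character ID, it can help to resolve one to the other. The following is a minimal sketch that assumes the directory layout shown above; the function name is my own, and reading /var/lib/docker normally requires root privileges.

```python
from pathlib import Path

DOCKER_CONTAINERS = Path("/var/lib/docker/containers")


def find_json_log(short_id, root=DOCKER_CONTAINERS):
    """Resolve a short container ID to the path of its <container-id>-json.log file."""
    root = Path(root)
    matches = [d for d in root.iterdir() if d.is_dir() and d.name.startswith(short_id)]
    if len(matches) != 1:
        raise LookupError(
            "expected exactly one container directory for %r, found %d"
            % (short_id, len(matches))
        )
    container_dir = matches[0]
    return container_dir / (container_dir.name + "-json.log")
```

For example, `find_json_log("051cf8e374dd")` would return the path of the -json.log file listed in the directory above.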


8) Viewing Docker Networks


To see all Docker networks, use:


# docker network list


NETWORK ID NAME DRIVER SCOPE

c0ab476dfb0a bridge bridge local

79f5e2d63623 docker_gwbridge bridge local

dee475371fcb host host local



Generating the API Call Logs


You can access the Multi-Site Orchestrator API call logs through the Infra Logs in a Troubleshooting Report.


You can also access the Multi-Site API call logs directly on an Orchestrator node with the following steps:


Procedure


Step 1

Locate the worker node that has the msc-executionengine service running, as in the following example:


Example:

[root@worker1 ~]# docker ps

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

1538a9289381 msc-kong:latest "/docker-entrypoin..." 2 weeks ago Up 2 weeks 7946/tcp, 8000-8001/tcp, 8443/tcp msc_kong.1.ksdw45p0qhb6c08i3c8i4ketc

cc693965f502 msc-executionengine:latest "bin/executionengine" 2 weeks ago Up 2 weeks (healthy) 9030/tcp msc_executionengine.1.nv4j5uj5786yj621wjxsxvgxl

00f627c6804c msc-platformservice:latest "bin/platformservice" 2 weeks ago Up 2 weeks (healthy) 9050/tcp msc_platformservice.1.fw58jr62dfcme4noh67am0s73

In this case, container cc693965f502 runs the msc-executionengine:latest image. Find its -json.log file, which contains the API calls from Multi-Site to the APIC controllers.


Step 2

Enter the command in the following example:


Example:

# cd /var/lib/docker/containers/cc693965f5027f291d3af4a6f2706b19f4ccdf6610de3f7ccd32e1139e31e712

# ls

cc693965f5027f291d3af4a6f2706b19f4ccdf6610de3f7ccd32e1139e31e712-json.log checkpoints config.v2.json hostconfig.json hostname

hosts resolv.conf resolv.conf.hash shm


# less cc693965f5027f291d3af4a6f2706b19f4ccdf6610de3f7ccd32e1139e31e712-json.log | grep intersite

{"log":" \u003cfvBD name=\"internal\" arpFlood=\"yes\" intersiteBumTrafficAllow=\"yes\" unkMacUcastAct=\"proxy\" intersiteL2Stretch=\"yes\"\u003e\n","stream":"stdout","time":"2017-07-25T08:41:51.241428676Z"}

{"log":" \"intersiteBumTrafficAllow\" : true,\n","stream":"stdout","time":"2017-07-27T07:17:55.418934202Z"}
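Each line of the -json.log file is a JSON object with `log`, `stream`, and `time` keys, so the escaped characters (such as `\u003c` for `<`) become readable once the line is decoded. The sketch below filters decoded entries by keyword. The second sample line is made up to show a non-matching entry.

```python
import json


def grep_json_log(lines, keyword):
    """Return (time, decoded log text) for entries whose text contains keyword."""
    hits = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(entry, dict):
            continue
        text = entry.get("log", "")
        if keyword in text:
            hits.append((entry.get("time", ""), text.rstrip("\n")))
    return hits


sample = [
    r'{"log":" \"intersiteBumTrafficAllow\" : true,\n","stream":"stdout","time":"2017-07-27T07:17:55.418934202Z"}',
    # Hypothetical entry that should not match:
    r'{"log":"unrelated line\n","stream":"stdout","time":"2017-07-27T07:17:56.000000000Z"}',
]
for when, text in grep_json_log(sample, "intersite"):
    print(when, text)
```

On a real node, the `lines` argument can be an open file handle on the -json.log file, since iterating a file yields one line at a time.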




Reading the Execution Log


The execution log provides three different kinds of log information:


  • Websocket refresh information that is printed out every 5 minutes.

  • The schema to push and the plan being generated.

  • Websocket monitoring VNID for cross VNID programming.


Example of websocket refresh output:


2017-07-11 18:02:45,541 [debug] execution.serice.monitor.WSAPicActor - WebSocket connection open

2017-07-11 18:02:45,542 [debug] execution.serice.monitor.WSAPicActor - Client 3 intialized

2017-07-11 18:02:45,551 [debug] execution.serice.monitor.WSAPicActor - WSAPicActor stashing message Monitor Policy(WSMonitorQuery(/api/class/fvRsNodeAtt,?subscript

2017-07-11 18:02:45,551 [debug] execution.serice.monitor.WSAPicActor - WSAPicActor stashing message RefreshClientTokenFailed()


Note the following signs of errors:


  • Log lines starting with a red error.

  • Stack traces for exceptions.
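Scanning a long execution log for these signs can be scripted. The sketch below relies on the line layout visible in the sample (timestamp, level in square brackets, logger name, message). Two things are assumptions on my part: that error lines carry an `[error]` level tag, and that stack trace frames look like JVM-style lines beginning with whitespace and `at`. The error and stack trace lines in the sample are invented for illustration.

```python
import re

LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<level>\w+)\] (?P<logger>\S+) - (?P<msg>.*)$"
)
# Assumption: stack frames are indented and start with "at ", as in JVM traces.
STACK_FRAME = re.compile(r"^\s+at \S+")


def find_problems(lines):
    """Return (line_number, kind, text) for error-level lines and stack frames."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        parsed = LOG_LINE.match(line)
        if parsed and parsed.group("level").lower() == "error":
            problems.append((number, "error", parsed.group("msg")))
        elif STACK_FRAME.match(line):
            problems.append((number, "stacktrace", line.strip()))
    return problems


sample = [
    "2017-07-11 18:02:45,541 [debug] execution.serice.monitor.WSAPicActor - WebSocket connection open",
    "2017-07-11 18:02:46,000 [error] execution.service.Example - push failed",  # hypothetical
    "java.lang.IllegalStateException: example",  # hypothetical
    "    at example.Class.method(Class.scala:42)",  # hypothetical
]
print(find_problems(sample))
```

The line numbers in the result make it easy to jump to the surrounding context in the full log.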



Verifying Policy Resolution on APIC Sites


In this task, use a REST API MO query on local APIC sites or switches to view the policies resolved on an APIC, for a site managed by Cisco ACI Multi-Site.


For diagrams of the managed object (MO) relationships, see the Cisco APIC Management Information Model Reference (MIM). For example, in the MIM, see the diagram for fv:FabricExtConnP.


Procedure


Step 1

To view details for the logical MOs under the Fabric External Connection Profile (fabricExtConnP), log on to the APIC CLI and enter the following MO query:


Example:

admin@apic1:~> moquery -c fvFabricExtConnP -x "query-target=subtree" | egrep "#|dn"

# fv.IntersiteMcastConnP

dn: uni/tn-infra/fabricExtConnP-1/intersiteMcastConnP

# fv.IntersitePeeringP

dn: uni/tn-infra/fabricExtConnP-1/ispeeringP

# fv.IntersiteConnP

dn: uni/tn-infra/fabricExtConnP-1/podConnP-1/intersiteConnP-[5.5.5.1/32]
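The filtered moquery output alternates between a `# class` line and a `dn:` line, which makes it straightforward to turn into pairs for comparison between sites. The following sketch works on saved output text; the function name is my own.

```python
def parse_moquery(output):
    """Turn '# class' / 'dn: ...' line pairs into a list of (class, dn) tuples."""
    pairs = []
    current_class = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("# "):
            current_class = line[2:].strip()
        elif line.startswith("dn:") and current_class is not None:
            pairs.append((current_class, line[3:].strip()))
            current_class = None  # each class header is followed by one dn
    return pairs


sample = """
# fv.IntersiteMcastConnP
dn: uni/tn-infra/fabricExtConnP-1/intersiteMcastConnP
# fv.IntersitePeeringP
dn: uni/tn-infra/fabricExtConnP-1/ispeeringP
# fv.IntersiteConnP
dn: uni/tn-infra/fabricExtConnP-1/podConnP-1/intersiteConnP-[5.5.5.1/32]
"""
for mo_class, dn in parse_moquery(sample):
    print(mo_class, "->", dn)
```

Running the same parser on output from two sites and comparing the class names is a quick way to spot an MO that exists on one site but not the other.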



Step 2

To view the logical MOs for the L3Out used for Multi-Site connections, log on to the APIC CLI and enter an MO query, such as the following:


Example:

admin@apic1:~> moquery -c l3extOut -x "query-target=subtree" | egrep "#|dn.*intersite" | grep -B 1 dn


# bgp.ExtP

dn: uni/tn-infra/out-intersite/bgpExtP

# fv.RsCustQosPol

dn: uni/tn-infra/out-intersite/instP-intersiteInstP/rscustQosPol

# l3ext.InstP

dn: uni/tn-infra/out-intersite/instP-intersiteInstP

# bgp.AsP

dn: uni/tn-infra/out-intersite/lnodep-node-501-profile/infraPeerP-[6.6.6.3]/as

# bgp.RsPeerPfxPol




Step 3

To view the resolved MOs for an APIC local site, log on to the APIC CLI and enter an MO query such as the following:


Example:

admin@apic1:~> moquery -c fvSite -x "query-target=subtree" | egrep "#|dn"

# fv.RemoteBdDef

dn: resPolCont/sitecont/site-6/remotebddef-[uni/tn-msite-tenant-welkin/BD-internal]

# fv.RemoteCtxDef

dn: resPolCont/sitecont/site-6/remotectxdef-[uni/tn-msite-tenant-welkin/ctx-dev]

# fv.RemoteEPgDef

dn: resPolCont/sitecont/site-6/remoteepgdef-[uni/tn-msite-tenant-welkin/ap-Ebiz/epg-data]
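Each resolved DN in this output encodes the remote site ID, the object type, and the DN of the object it mirrors. A small sketch for pulling those apart follows; the pattern is based only on the three DN shapes shown above, so other child classes under fvSite are simply ignored.

```python
import re

REMOTE_DEF = re.compile(
    r"^resPolCont/sitecont/site-(?P<site>\d+)/"
    r"remote(?P<kind>bd|ctx|epg)def-\[(?P<target>.+)\]$"
)
KIND_NAMES = {"bd": "Bridge Domain", "ctx": "VRF", "epg": "EPG"}


def parse_remote_def(dn):
    """Return (site_id, object_type, target_dn), or None if the DN has another shape."""
    match = REMOTE_DEF.match(dn.strip())
    if not match:
        return None
    return (int(match.group("site")), KIND_NAMES[match.group("kind")], match.group("target"))


sample = [
    "resPolCont/sitecont/site-6/remotebddef-[uni/tn-msite-tenant-welkin/BD-internal]",
    "resPolCont/sitecont/site-6/remotectxdef-[uni/tn-msite-tenant-welkin/ctx-dev]",
    "resPolCont/sitecont/site-6/remoteepgdef-[uni/tn-msite-tenant-welkin/ap-Ebiz/epg-data]",
]
for dn in sample:
    print(parse_remote_def(dn))
```

If an EPG, BD, or VRF that should be stretched is missing from the parsed list, the policy has not been resolved on this site.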



Step 4

To view the concrete MOs on a switch for a Multi-Site site, log on to the switch and enter an MO query such as the following:


Example:

spine501# moquery -c dci.LocalSite -x "query-target=subtree" | egrep "#|dn"

# l2.RtToLocalBdSubstitute //(site5 vrf 2195456 -> bd 15794150 is translated to site6 vrf 2326528 -> bd 16449430)

dn: sys/inst-overlay-1/localSite-5/localCtxSubstitute-[vxlan-2195456]/localBdSubstitute-[vxlan-15794150]/rttoLocalBdSubstitute-[sys/inst-overlay-1/remoteSite-6/remoteCtxSubstitute-[vxlan-2326528]/remoteBdSubstitute-[vxlan-16449430]]

# l2.LocalBdSubstitute

dn: sys/inst-overlay-1/localSite-5/localCtxSubstitute-[vxlan-2195456]/localBdSubstitute-[vxlan-15794150]


What to look for: The output displays the data translated between sites. In this example, the original data on the sites was as follows:


site5 vrf msite-tenant-welkin:dev -> vxlan 2195456, bd internal -> vxlan 15794150, epg web: access-encap 200 -> pcTag 49154, access-encap 201 -> pcTag 16387


site6 vrf msite-tenant-welkin:dev -> vxlan 2326528, bd internal -> vxlan 16449430, epg web: access-encap 200 -> pcTag 32770, access-encap 201 -> pcTag 16386
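The translation itself can be read straight out of the l2.RtToLocalBdSubstitute DN, which contains the local and remote site IDs together with the VRF and BD VNIDs on each side. Here is a sketch that extracts them, assuming the DN has been joined onto a single line with any wrapping removed. The dictionary keys are my own naming.

```python
import re

TRANSLATION = re.compile(
    r"localSite-(?P<local_site>\d+)/"
    r"localCtxSubstitute-\[vxlan-(?P<local_vrf>\d+)\]/"
    r"localBdSubstitute-\[vxlan-(?P<local_bd>\d+)\]/"
    r"rttoLocalBdSubstitute-\[.*remoteSite-(?P<remote_site>\d+)/"
    r"remoteCtxSubstitute-\[vxlan-(?P<remote_vrf>\d+)\]/"
    r"remoteBdSubstitute-\[vxlan-(?P<remote_bd>\d+)\]\]"
)


def parse_translation(dn):
    """Extract site IDs and VRF/BD VNIDs from an rttoLocalBdSubstitute DN."""
    match = TRANSLATION.search(dn)
    if not match:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


dn = (
    "sys/inst-overlay-1/localSite-5/localCtxSubstitute-[vxlan-2195456]/"
    "localBdSubstitute-[vxlan-15794150]/rttoLocalBdSubstitute-"
    "[sys/inst-overlay-1/remoteSite-6/remoteCtxSubstitute-[vxlan-2326528]/"
    "remoteBdSubstitute-[vxlan-16449430]]"
)
print(parse_translation(dn))
```

The extracted numbers should match the per-site VNIDs listed above: VRF 2195456 and BD 15794150 on site 5, VRF 2326528 and BD 16449430 on site 6.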


Step 5


To verify the concrete MOs for a remote site, enter an MO query such as the following:


Example:

spine501# moquery -c dci.RemoteSite -x "query-target=subtree" | egrep "#|dn"

# dci.AnycastExtn

dn: sys/inst-overlay-1/remoteSite-6/anycastExtn-[6.6.6.1/32]

// attribute is_unicast is Yes, Unicast ETEP

# dci.AnycastExtn

dn: sys/inst-overlay-1/remoteSite-6/anycastExtn-[6.6.6.2/32]

// attribute is_unicast is No, Multicast ETEP

# l2.RsToLocalBdSubstitute

dn: sys/inst-overlay-1/remoteSite-6/remoteCtxSubstitute-[vxlan-2326528]/remoteBdSubstitute-[vxlan-16449430]/rsToLocalBdSubstitute
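To list the ETEP addresses a spine has learned for each remote site, the dci.AnycastExtn DNs can be grouped by site ID. This is a rough sketch based only on the DN format shown above. It does not tell unicast from multicast ETEPs, because that comes from the is_unicast attribute, which is not part of the DN.

```python
import ipaddress
import re

ANYCAST = re.compile(r"remoteSite-(?P<site>\d+)/anycastExtn-\[(?P<prefix>[^\]]+)\]")


def remote_etep_addresses(dns):
    """Group anycastExtn addresses by remote site ID."""
    by_site = {}
    for dn in dns:
        match = ANYCAST.search(dn)
        if not match:
            continue  # not an anycastExtn DN
        network = ipaddress.ip_network(match.group("prefix"), strict=False)
        by_site.setdefault(int(match.group("site")), []).append(str(network.network_address))
    return by_site


sample = [
    "sys/inst-overlay-1/remoteSite-6/anycastExtn-[6.6.6.1/32]",
    "sys/inst-overlay-1/remoteSite-6/anycastExtn-[6.6.6.2/32]",
    "sys/inst-overlay-1/remoteSite-6/remoteCtxSubstitute-[vxlan-2326528]",
]
print(remote_etep_addresses(sample))
```

A remote site with no addresses in the result would suggest that the spine has not received the ETEP configuration for that site.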


