Simple Troubleshooting Steps for Cloud ACI/AWS to OnPrem ACI Endpoint Reachability Issues

Table of Contents:

  1. Introduction
  2. 2 Items to Check: Control Plane & Data Plane
    2.a. Verifying Control Plane
    2.b. Verifying Data Plane
    2.c. Verify EVPN on Spine
  3. References


In this writeup I will go through some very simple troubleshooting steps that you can follow if you are having tenant endpoint reachability issues between the ACI/AWS fabric and the onPrem fabric.

Regardless of whether you are using Internet connectivity or DX (Direct Connect) connectivity for the connection, these troubleshooting steps should help. The assumption is that you have already brought up the cAPIC fabric on AWS, done the necessary configurations from NDO, brought up tenants and their objects, and you cannot reach an onPrem endpoint (let’s say a VM) from an AWS endpoint (EC2).

2 Items to Check: Control Plane & Data Plane

The important thing to note for troubleshooting is that you have to look at 2 aspects:

  • Control Plane
  • Data Plane

The Control Plane is BGP EVPN between the ACI onPrem Spine and the C8Kv routers.

The Data Plane is VXLAN.

If either of these is broken, you will have reachability problems.

Verifying Control Plane

a) Check the onPrem leaf route table.
Verify that you can see the cloud CIDR with a next hop of the CSR Gig4 interface:
show ip route vrf userVRF
b) Check the cloud VPC egress route tables (on the AWS Tenant Account):
onPrem subnet -> TGW
c) Check the TGW route table to verify that the 0/0 route is learned through the CSR (on the AWS Infra Account).
d) Check the NSG rules.
e) Check the zoning rules on onPrem.
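Assuming a tenant VRF named userVRF (from the command above) and hypothetical AWS resource IDs, the checklist can be sketched as CLI commands — the subnet and route-table IDs below are placeholders, not values from my lab:

```shell
# (a) On the onPrem leaf: the cloud CIDR should appear with the C8Kv Gig4 IP
#     as the next hop
show ip route vrf userVRF

# (b) On the AWS Tenant Account: dump the route table associated with the EC2
#     subnet; the onPrem subnet should point at the TGW attachment
#     (subnet ID is a placeholder)
aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=subnet-0123456789abcdef0"

# (c) On the AWS Infra Account: search the TGW route table for the 0/0 route
#     learned through the Connect attachment (route-table ID is a placeholder)
aws ec2 search-transit-gateway-routes \
    --transit-gateway-route-table-id tgw-rtb-0123456789abcdef0 \
    --filters "Name=route-search.exact-match,Values=0.0.0.0/0"
```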

Below are screenshots from my lab on the items mentioned above.

a) Verify that the Tenant VRF shows the Cloud CIDR being advertised with NH of Gig4 of C8KV

The Gig4 IP of the C8KV is shown below:
Figure 1: Looking at G4 IP of C8KV

Verifying that my Tenant VRF shows the VPC CIDR with the next hop of the C8KV Gig4 IP:
Figure 2: Verifying on the ACI Leaf that the VPC CIDR shows with the correct next hop

b) On the Tenant AWS Account, check the route table of the EC2 subnet to verify that the onPrem subnet shows up with a next hop of the TGW

Figure 3: Route Table of EC2 shows onPrem prefix with next hop of TGW

c) Checking the TGW route table on the Infra AWS Account to verify that 0/0 is learnt through the Connect attachment
Figure 4: Checking TGW route Table on Infra account for 0/0

Also verify that the Connect attachment connects to the CSR Gig2

Figure 5: TGW Connect Attachment peers with C8KV Gig 2
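The same peering detail can also be pulled with the AWS CLI on the Infra account (a sketch; the filter simply narrows to available peers) — the PeerAddress in the output should match the C8Kv Gig 2 IP:

```shell
# List TGW Connect peers; PeerAddress should be the C8Kv Gig 2 IP
aws ec2 describe-transit-gateway-connect-peers \
    --filters "Name=state,Values=available"
```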

d) Check NSG rules
Make sure your ingress and egress rules are configured correctly for the EC2 instance (based on your contract configurations from NDO)

Figure 6: Checking NSG rules

e) Check the zoning rules on onPrem
Please see the writeup below on how to verify the zoning rules (normally this should not be a problem unless it’s misconfigured)
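On the onPrem leaf itself, the zoning rules can be listed with show zoning-rule; a minimal sketch, assuming a hypothetical VRF scope (VNID) of 2654208 taken from the VRF object on the APIC:

```shell
# List the zoning rules on the leaf, filtered to the VRF scope (VNID);
# the scope value is a placeholder — read it from the VRF object on the APIC
show zoning-rule scope 2654208
```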

Verifying Data Plane

If the Control Plane looks good, but your reachability is still not working, chances are that the problem is in the Data Plane.

📙 It’s important to remember the following:

From:      To:        VXLAN Tunnel
OnPrem     Cloud      onPrem Leaf –> C8Kv
Cloud      OnPrem     C8Kv –> onPrem Spine

Below is the quick list of items to verify:

a. Try to ping the CSR Gig4 from the onPrem leaf.
b. From the CSR: ping the BGP EVPN RID of the spine.
c. From the CSR: show ip route
-> you should see the onPrem infra TEP pool.
d. show nve peers on the Cat8Kv. All of these should be up (these are the VXLAN tunnels).
e. Ping from onPrem and capture packets on the C8KV with packet-trace to verify that packets are coming in.
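Checks (a) and (b) above can be sketched as follows; the VRF name and IP addresses are placeholders (on an ACI leaf, VRF-aware ping is done with iping):

```shell
# (a) On the onPrem leaf: ping the C8Kv Gig4 IP from the tenant VRF
#     (-V selects the tenant:VRF context; names and addresses are placeholders)
iping -V tenant1:userVRF 10.100.0.10

# (b) On the C8Kv: ping the BGP EVPN router ID of the onPrem spine
#     (RID is a placeholder; obtain it from the spine as shown below)
ping 10.7.0.3
```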

Below are screenshots from my lab on the items mentioned above.

a) try to ping CSR g4 from onprem leaf

Looking for G4 IP on C8KV
Figure 7: Looking at Gig4 IP on C8KV

Pinging G4 IP from Leaf:

Figure 8: Pinging G4 IP from Leaf

b) from CSR: ping BGP EVPN RID of spine

First, get the BGP RID of the spine:

Figure 9: Obtaining BGP RID of Spine

Pinging the RID of the Spine from the C8KV:
Figure 10: Pinging the BGP RID of Spine from C8KV

c) Checking for TEP Pool in Routing table of C8KV

Quick way to look at TEP Pool:
on onPrem APIC do:

cat /data/data_admin/sam_exported.config

Figure 11: Determining the TEP Pool configured on onPrem APIC
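As a shortcut, you can grep the exported config for the pool line instead of scrolling through the whole file (the exact field name may vary by APIC version, so treat this as a sketch):

```shell
# On the onPrem APIC: pull only the TEP-pool line out of the exported config
grep -i pool /data/data_admin/sam_exported.config
```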

You can also check the TEP pools that are used for the destination TEPs from the leaf:

show isis dtep vrf overlay-1

Figure 12: show isis dtep vrf overlay-1

Now check that those TEP routes show up on the cloud C8KV:

show ip route | i 10.7   # (notice it's coming from Tu6 in this case)

Figure 13: TEP Pool Prefixes show on routing table of C8KV

d) show nve peers (to verify that the VXLAN tunnels have established)
Figure 14: looking at NVE Peers

e) Ping from onPrem and capture packets on the C8KV to verify that packets are coming in.

I show the use of packet-trace in the writeup linked above; please see right after Figure 41 there.

The commands on the CSRs to be used are as follows:

    • debug platform packet-trace packet 128
    • debug platform condition ipv4 <src-ip/mask> both
    • debug platform condition start
 To View:
    • show platform packet-trace statistics
    • show platform packet-trace summary
 To Stop:
    • debug platform condition stop
 Other Useful Commands:
    • show platform packet-trace code               # Show packet-trace drop, inject or punt codes
    • show platform packet-trace configuration      # Show packet-trace debug configuration
    • show platform packet-trace packet             # Per-packet details for traced packets
    • show platform packet-trace statistics         # Statistics for packets traced and packet disposition
    • show platform packet-trace summary            # Per-packet summary information for traced packets
    • clear platform packet-trace configuration
    • clear platform packet-trace statistics

Verify EVPN on Spine

On the onPrem spine, use the commands below to verify the EVPN network.

show bgp internal limited-vrf
show bgp internal network vrf name_of_vrf

Figure 15: Verifying EVPN on Spine
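As an additional check beyond the screenshots above, the EVPN peering toward the C8Kvs can be confirmed with the standard BGP summary on the spine; the neighbors should be Established:

```shell
# On the onPrem spine: the C8Kv EVPN neighbors should show Established
# with a prefix count rather than Idle/Active
show bgp l2vpn evpn summary vrf overlay-1
```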


References

Cloud ACI Documentation
