ACI/Cloud Extension Usage Primer (Azure) – A Practical Guide to using Azure vNET Peering with Cloud ACI

In this blog post, I will take a practical approach to using Azure's vNET Peering feature, which is supported with Cisco Cloud ACI starting with cAPIC release 5.0.2e.

So, what is vNET Peering in Azure and why should you use it?

From the Microsoft Azure blogs, I found a very comprehensive discussion on this:

VNet Peering provides a low latency, high bandwidth connection useful in scenarios such as cross-region data replication and database failover scenarios. Since traffic is completely private and remains on the Microsoft backbone, customers with strict data policies prefer to use VNet Peering as public internet is not involved. Since there is no gateway in the path, there are no extra hops, ensuring low latency connections.

VPN Gateways provide a limited-bandwidth connection and are useful in scenarios where encryption is needed but bandwidth restrictions are tolerable. In these scenarios, customers are also not as latency-sensitive.

Figure 1

If you have read the previous article on AWS Transit Gateway usage with Cloud ACI, this will seem very similar to you.  The idea is much the same, though there are a few differences in the Cloud ACI implementation due to the underlying cloud architectures.  I'll point these out as I go.

Just like AWS TGW, Azure vNET Peering is:

  • Non-Transitive: if vNET A is peered to vNET B and vNET B is peered to vNET C, that does not imply that vNET A can talk to vNET C using vNET B as a transit vNET (by default).  That's why Cisco uses a hub-and-spoke model for vNET Peering.  Otherwise it would become too much of a mesh (and a mess).
  • vNET Peering is based on static routes.  There is no concept of a VRF.  This implies that you cannot have overlapping subnets for vNETs that use vNET Peering.  Of course, you can use VGW and vNET Peering at the same time (just like you can use TGW and VGW in AWS at the same time).
  • Both vNET Peering in Azure and TGW in AWS have the capability of using User Defined Routes, also known as UDRs.
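To get a feel for what cAPIC automates here, this is roughly what creating one peering by hand looks like with the Azure CLI. This is a sketch with placeholder names (resource group and vNET names are made up); cAPIC creates all of these peerings for you:

```shell
# Hypothetical names for illustration only; cAPIC does this automatically.
RG=my-rg                       # resource group (placeholder)
HUB=overlay-1                  # infra/hub vNET
SPOKE=vrf-web                  # tenant/spoke vNET

# A peering consists of two unidirectional links, so it must be
# created from both sides before Azure considers it Connected.
az network vnet peering create --name hub-to-spoke \
  --resource-group $RG --vnet-name $HUB \
  --remote-vnet $SPOKE --allow-vnet-access --allow-forwarded-traffic

az network vnet peering create --name spoke-to-hub \
  --resource-group $RG --vnet-name $SPOKE \
  --remote-vnet $HUB --allow-vnet-access --allow-forwarded-traffic
```

The --allow-forwarded-traffic flag is what lets a spoke accept traffic that was forwarded by the CSRs in the hub rather than originated there.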

Inter Region Peering:

  • In Azure there are 2 kinds of vNET peering:
    • Local Peering – Peering in same region
    • Global Peering – Peering between regions
      • if a prefix is reachable by both Local vNET peering and Global vNET peering, by default, Local vNET peering has precedence (since it’s lower cost)
  • In AWS TGW, each VPC can have a TGW VPC attachment to the TGW Service (also known as the TGW acceptor).  TGWs can peer with each other across regions.

Now that we’ve discussed vNET Peering and what it is, let’s take a practical approach and use Azure vNET Peering feature with Cisco Cloud APIC (5.0.2e and above)

We will build our ACI Fabric Tenant topology, utilize Azure's vNET peering, and investigate how this works.  Let's start building the following topology.  To start off, let's first build just the shaded part of the figure below, which is the Tenant and the VRFs.  I will only point out the newer things that need to be done for using vNET peering and will not show every little step of clicking and accepting.  By now, if you have followed the other writeups, you will be very comfortable using the MSO to build ACI objects in the cloud.  ACI objects are built the same way regardless of the underlying cloud provider or physical Fabric (with some minor differences).

Figure 2

In the figure above you will notice that VRF-WEB is in the West US region, whereas VRF-APP is in the East US region.  The East US region also happens to be the home region for ACI Cloud, which is where the cAPIC is deployed.  The home region always has cloud CSRs deployed.  Other regions for that particular fabric (ACI Site) can also have CSRs if you wish, but they are not considered the home region.

If you recall from the 5.0.2 cAPIC for Azure Install article, for this particular fabric, my home region was East US and I had 2 more regions, Central US and West US where I chose not to deploy any cloud CSRs.   So, my only Cloud CSRs are in my East US home region.

Figure 3

In the Cisco Cloud ACI Azure Implementation using vNET peering, we use a hub and spoke model.

The HUB is always the Infra vNET that has CSRs (regardless of whether it is the home region or not).  In other words, if you don't have any CSRs defined in a region, that region cannot be the HUB for vNET Peering purposes.

Now, looking at my setup: since my cloud CSRs are in the East US region, that region can be a HUB for vNET peering.  My other regions don't have CSRs, so they cannot be HUBs.

Since I have a tenant vNET in the West US region and I've decided to use vNET Peering for that vNET, it has to peer with the East US HUB region.

Also, please pay attention to the Azure native NLB (Network Load Balancer) in the HUB vNET.  This NLB load-balances the traffic to the 2 CSRs.  We'll discuss this in more detail a bit later.

The diagram below should make this clear.

Figure 4

Contrast this to the AWS TGW setup that we did before.  You will immediately notice that for TGW purposes, we don't care whether the other region has CSRs or not.  Every Tenant VPC has a TGW attachment to a TGW in its own region, and the TGWs peer with each other.  The figure below shows that example.

Figure 5

Below is a diagram showing ACI Cloud Azure Fabric with more than 1 region having CSRs.

Here you will notice the following:

  • The infra VNets are used as the Hubs
  • CSRs in the Hub VNets act as the NVAs for spoke-to-spoke routing
  • An NLB is placed in front of the CSRs to perform load balancing and failover
  • All spoke VNets are peered with all Hub VNets (across regions)
  • UDRs (User Defined Routes) in the spoke VNets redirect traffic destined to other spokes/sites to the CSR NLB
Figure 5a

Going back to our example setup:

  • In this case, I will create the ACI Tenant on a totally separate Azure subscription (in the same directory as the Infra Tenant).  (The current release does not support vNET Peering for Azure subscriptions in a different Azure directory.)
  • Before we start, let's make sure we give the proper permissions to the Azure Tenant subscription.
  • There are 2 roles we need to define:
  • On the Tenant subscription, define the Contributor role for the cAPIC of the Infra account (please see the article on Adding Trusted/Untrusted Accounts in the unofficial guide)
  • On the Tenant subscription, define the Network Contributor role for the cAPIC of the Infra account
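If you prefer the Azure CLI over the portal for these role assignments, the equivalent commands look roughly like this (the service-principal ID and subscription ID below are placeholders for your own values):

```shell
# Placeholders: APP_ID is the service principal used by the Infra cAPIC,
# TENANT_SUB is the tenant subscription ID.
APP_ID=<capic-app-id>
TENANT_SUB=<tenant-subscription-id>

# Contributor role on the tenant subscription for the Infra cAPIC identity
az role assignment create --assignee $APP_ID \
  --role "Contributor" --scope "/subscriptions/$TENANT_SUB"

# Network Contributor role on the same scope
az role assignment create --assignee $APP_ID \
  --role "Network Contributor" --scope "/subscriptions/$TENANT_SUB"
```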
Figure 6

After you configure your Schema/template in the MSO, map it to your Azure Tenant, and create the VRFs, you will go to the site-local implementation of your template and configure the details for the VRFs.  Here you will choose the region and the CIDRs/subnets.  When doing vNET peering only for the VRF, you don't need to define a subnet for the VPN gateway (VGW).

Figure 7

Make sure to check vNET Peering and select the HUB Network, which is called “Default”.

Figure 8

Do the same for the APP VRF

Figure 9

Enable vNET peering for the APP VRF as well.

Figure 10

Before we configure the rest of the ACI Tenant (i.e. EPGs, Contracts, etc.), let's see what has happened so far.

Go to the cAPIC Azure UI and click on Application Management / Cloud Context Profile.   Then double-click one of your Tenant VRFs.

Figure 11

Look at Health/Faults, and if you see any problems, look at Event Analysis for details.   Click on Cloud Resources and investigate the options.

Figure 12

On cAPIC go to Application Management / VRFs and double click on one of the user VRFs.

Figure 13

Like before, check Health/Faults and click on Event Analysis if you see any problems.   Click on Cloud Resources to investigate.

Figure 14

Now, click on Cloud Resources / Virtual Networks and then double click on one of the tenant VRFs that you created

Figure 15

Next, click on Cloud Resources

Figure 16

In the next screen click on Virtual Network Peers

Figure 17

Here you will see the vNET peers.   On each of the tenant-defined VRFs you will see 2 vNET peers.  This is because the vNET peering consists of 2 unidirectional peers: one going to the HUB vNET and the other coming in from the HUB vNET.

Figure 18

If you go to the Infra vNET overlay-1, you will see 4 vNET peers: in/out to vNET APP and in/out to vNET WEB.

Figure 19

You can also check the vNET peering from the Azure console.  Go to the Azure console UI, open the overlay-1 vNET, and look at Peerings.  Note that the Azure UI will show only 2 vNET peers.  The way the Azure UI presents it: if the vNET peering is configured in both directions (from the overlay-1 side to vNET APP, and then from vNET APP back to overlay-1), the peering is considered complete and the Peering State will show Connected.  If you only configure one side, the Peering State will show Initiated.
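The same peering state can be pulled from the Azure CLI instead of the portal. A sketch (the resource-group name is a placeholder; overlay-1 is the infra vNET from this setup):

```shell
# List all peerings on the infra vNET with their state.
# <infra-rg> is a placeholder for the resource group cAPIC created.
az network vnet peering list \
  --resource-group <infra-rg> --vnet-name overlay-1 \
  --query "[].{Name:name, State:peeringState, Remote:remoteVirtualNetwork.id}" \
  -o table
```

A peeringState of Connected means both directions are in place; Initiated means only one side has been configured.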

Figure 20

Similarly, looking in the Azure UI from vNET APP will show one connected peer to the infra vNET overlay-1.

Figure 21

At this point, let's go ahead and configure the rest of the ACI objects for that tenant: the un-shaded parts of the figure below.

Figure 22

On the Azure console, search in the search bar for "vrf".  Choose the route table of one of your Tenant VRFs, say web_egress.

Figure 23

Click on Routes.  You will see that the route to 10.80.0.0/16 (the APP VRF) is pointing to 10.33.1.36.   Also, if you click on Subnets, you will see that the subnet for this VRF (the WEB VRF) is 10.70.5.0/24.
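The same routes can be listed from the Azure CLI. A sketch, with the resource-group name as a placeholder (web_egress is the route table we just opened in the portal):

```shell
# List the UDRs in the web_egress route table; you should see the
# APP VRF prefix (10.80.0.0/16) with a next hop of the NLB frontend IP.
az network route-table route list \
  --resource-group <tenant-rg> --route-table-name web_egress -o table
```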

Figure 24

You will see a similar kind of result for the APP VRF egress route table.

Figure 25

Now on Azure UI, search for load balancer

Figure 26

Choose your hub NLB (Network Load Balancer )

Figure 27

Click on Frontend IP configuration.  You will see that the 10.33.1.36 IP that we saw before is actually the frontend IP of the NLB.

Figure 28

Now, for that NLB, click on Backend pools and you will see that the backend pool configured for that NLB happens to be the Gig2 interfaces of your Azure cloud CSRs.  SSH into your cloud CSRs and verify.

Figure 29

If you click on Health probes, you will notice that the NLB is using TCP port 22 for its health probes.
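All three pieces of the NLB we just looked at in the portal (frontend IP, backend pool, health probes) can also be inspected from the Azure CLI. Resource-group and load-balancer names below are placeholders:

```shell
# Placeholders for the infra resource group and the hub NLB name.
RG=<infra-rg>
LB=<hub-nlb-name>

# Frontend IP (should show 10.33.1.36 in this setup)
az network lb frontend-ip list --resource-group $RG --lb-name $LB -o table

# Backend pool (the CSR Gig2 interfaces)
az network lb address-pool list --resource-group $RG --lb-name $LB -o table

# Health probes (TCP port 22)
az network lb probe list --resource-group $RG --lb-name $LB -o table
```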

Figure 30

Click on NLB / Monitoring / Metrics and choose the Health Probe Status metric.  From here you can see the status of the health probes received.  If it is at 100%, all is good.

Figure 31

Also check the Data Path Availability metric.
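These two metrics can also be queried from the CLI. A sketch with placeholder names; DipAvailability and VipAvailability are the Azure Load Balancer metric IDs that back the "Health Probe Status" and "Data Path Availability" charts in the portal:

```shell
# Resolve the NLB resource ID (placeholder resource group / LB name).
LB_ID=$(az network lb show -g <infra-rg> -n <hub-nlb-name> --query id -o tsv)

# DipAvailability = health probe status, VipAvailability = data path availability.
az monitor metrics list --resource $LB_ID \
  --metric DipAvailability VipAvailability \
  --aggregation Average --interval PT5M -o table
```

An average of 100 on both means the probes are passing and the data path is healthy.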

Figure 32

So, now that we have confirmed that everything is looking good, let’s bring up a workload (VM) for WEB and one for APP.  Let’s first start with the APP VM.

Figure 33
Figure 34

Please make sure to set the tag key:value pair as you defined in your MSO EPG Selector policy for the APP EPG.
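If you bring the VM up from the CLI instead of the portal, the tag is just the --tags argument. A sketch with placeholder names; tier=app is an assumed selector value here (the article only confirms tier==web for the WEB EPG), so use whatever you configured in MSO:

```shell
# Placeholders throughout; the important part is --tags matching the
# EPG Selector defined in MSO for the APP EPG (tier=app is assumed).
az vm create --resource-group <tenant-rg> --name APP-VM \
  --image UbuntuLTS --vnet-name <app-vnet> --subnet <app-subnet> \
  --admin-username azureuser --generate-ssh-keys \
  --tags tier=app
```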

Figure 35

Once the VM is created, go to the VM and look at Networking.  Make a note of the Private IP (10.80.5.4 in my case).  Also observe that the security group is the default security group ending in nsg.

Figure 36

Very soon the security group name will change and you will see new inbound security group rules come in (based on the contracts you defined from the MSO).

Figure 37

You can see this endpoint from cAPIC also by going to EPG/Endpoints

Figure 38

Repeat the same process to bring up a WEB-VM in the West region.   Make sure to match the tag to what you configured as the EPG Selector on the MSO for EPG-WEB (in my case tier==web).

Once it's ready, you should be able to SSH into the WEB-VM.  Get the public IP of the VM from the Azure console by looking at Networking for the WEB-VM.
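A quick CLI alternative for grabbing the IPs (the resource-group name is a placeholder):

```shell
# Lists both the public and private IPs of the VM in one shot.
az vm list-ip-addresses --resource-group <tenant-rg> --name WEB-VM -o table
```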

Figure 39

ssh to web-vm

Figure 40

From the web VM ping the APP VM Private IP

Figure 41

While the ping is running, let's do some debug captures from the cloud CSRs.

The commands on the CSRs to be used are as follows:

  • debug platform packet-trace packet 128
  • debug platform condition ipv4 10.70.5.4/32 both
  • debug platform condition start

To View:

  • show platform packet-trace statistics
  • show platform packet-trace summary

To Stop:

  • debug platform condition stop
Figure 42

From the debug you will observe that the ping packets are actually going through the cloud CSRs.

Figure 43

CSR1 is currently not in the datapath for these ping packets.  A different source/destination hash would land on this CSR.

Figure 44

Analyzing this, we can see that the packets go from VRF-APP in the East US region, through the vNET peering to the HUB NLB and on to CSR-0, then through the vNET peering to VRF-WEB in the West US region, and finally to the WEB VM.

Figure 45

Contrast this to what we saw in the TGW writeup.  You will notice that from VPC to VPC in AWS, the data path does not hit the CSRs.  There are no NLBs in the infra there either.

Figure 49

Let's also take a look at the cloud CSR configuration that was pushed from the cAPIC when we built our Tenant.

Figure 50

You will notice that quite a bit of configuration got pushed down from the cAPIC onto the cloud CSRs.   If you look at this, you will appreciate that this is rather complex configuration that would be hard to configure by hand at scale.  Cisco Cloud ACI does all of this for you without your involvement.

  • BDIs got pushed for each of our VRFs.
  • Gig2 on the CSR got programmed with a route-map.
  • The route-map matches the proper prefixes for each user VRF and inserts them into the proper BDI (the proper user VRF).
  • Based on the contracts configured, the proper static routes are configured and then redistributed into the proper VRF.
Figure 51
Figure 52

Network Watcher is another nice utility in Azure that can be pretty handy.

Figure 53
Figure 54
