ACI/Cloud Extension Usage Primer (Azure) – Multi-Node Service Graph with North/South Firewall Scaling using vNET peering and hosting service devices in HUB vNET (overlay-2)

In a previous article, Multinode Service Graphs with Horizontal Scaling of Firewalls for East/West traffic on Azure, I described that scenario and guided you step by step through configuring and testing it.

I have had quite a few folks reach out and request a similar writeup for North/South firewall scaling.  If you recall, what we are trying to accomplish in the cloud is the same thing that was done in a physical Data Center when you used Clustered Firewall technology.

Figure 1

Since Cloud Firewalls generally don’t offer clustering technology, we are going to achieve the same sort of result by placing an NLB/ALB in front of the Firewalls.

In the previous writeup, Multinode Service Graphs with Horizontal Scaling of Firewalls for East/West traffic on Azure, I had also listed a number of different topologies for different use cases and had done a deep dive on Case #7.

In this writeup we are going to do a deep dive on Case #4, which covers North/South Firewall Scaling.

Figure 4

As shown in the figure above, we are going to place our NLB and Firewalls in the HUB Network in the East Zone.  Recall from previous writeups on initial setup of cAPIC on Azure that we had configured our eastus region to be a hub.

Of course, for this use case we also had the option of doing everything in the tenant space instead of using the hub, as shown in Case #3 of Multinode Service Graphs with Horizontal Scaling of Firewalls for East/West traffic on Azure.

Figure 5

As you can see, for Case #4 we have several options for placing the devices.  The 1st device sits on the Consumer side (Internet) in the HUB, the 2nd device in the HUB, and the 3rd device in the Provider vNET:

Option 1: ALB -> Firewalls -> ALB
Option 2: ALB -> Firewalls -> NLB
Option 3: NLB -> Firewalls -> NLB
Option 4: NLB -> Firewalls -> ALB

In this writeup I will do a deep dive on Option 3 and Option 4, since those are the more interesting cases. 

Spoiler alert:  Both Case4::Option3 and Case4::Option4 distribute traffic across the Firewalls.  However, Case4::Option4 does a better job of load balancing the WEB Servers (the Web Farm).  If you want to skip the detailed configurations and just see the results, please jump all the way down to the “Testing & Conclusion” section of this writeup.

Since I’ve written many articles that show the gory details of configuring every step, in this article I will show you the steps that are necessary but avoid boring you with every little detail of the configuration.

Case4::Option3:

Before I start the step-by-step guidance and testing, I want to point out some caveats and items you should be aware of:

  1. For this scenario to work you need release 5.0.2h of cAPIC, and you must use IP-based selectors for the Firewall data interfaces.  This is due to the following bugs:
    • CSCvv19470: Tag-based selector does not work for a non-redirect, multi-node service graph with NLB as the 1st node
    • CSCvu99161: extEPG -> NLB -> FW -> ALB -> provider – data path is not working, missing rules on FW
    • In a future release these bugs will be fixed and you will be able to use tag-based interfaces with this topology.
  2. Since we are going to work with contracts and Service Graphs between the Infra Tenant and the user Tenant, we will do some initial configuration from MSO.  After that we will do most of the configuration directly from cAPIC.  This is because the Infra Tenant is not a managed Tenant in MSO.

The figure below shows the details of Case4::Option3.  Recall from the previous article Simple Service Graph with Azure Network Load Balancer & vNET Peering, where we had shown the differences in packet flow between NLB and ALB and had discussed that NLB in Azure with cAPIC does not do SNAT, while ALB in Azure with cAPIC does source NAT.

For that reason, I consider the topologies with an NLB as the 1st device the more interesting cases and want to show you how to configure those.  Using an ALB as the 1st node will be very similar from a configuration perspective.

Figure 6

The figure below shows the packet flow details from Internet source to Web Farm.

Figure 7

The figure below shows the return packet coming back from the Web Farm to the Internet Source.  Pay particular attention to Step 6 and Step 8.  Remember that NLB with cAPIC on Azure does not do SNAT (unlike ALB).  However, the Azure fabric changes the source IP back to the NLB VIP.  I call this the Azure magic trick!

Figure 8

Step by Step Configurations and explanations:

First, go to MSO and configure your basic Tenant with a VRF in eastus (our hub region), an EPG, and a Contract with an any filter.  There is no need to create an external EPG.  We will be using the HUB vNET’s “all-internet” external EPG for the Web Servers to go out.

In the Site Local instantiation of the VRF, please make sure you turn on vNET Peering and also configure both the WEB Subnet (10.81.5.0/24) and the NLB Subnet (10.81.255.0/24).  Recall from previous writeups, I had mentioned several times that the NLB needs to be in its own subnet (a subset of the main CIDR).

Figure 9
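
Once cAPIC pushes this out to Azure, you can sanity-check that both subnets were actually created in the Tenant vNET from the Azure CLI.  This is just a quick sketch; the resource group and vNET names are placeholders for whatever cAPIC created in your subscription:

# List the subnets of the Tenant vNET (resource group / vNET names are placeholders)
az network vnet subnet list \
  --resource-group <tenant-rg> \
  --vnet-name <tenant-vnet> \
  --query "[].{Name:name, Prefix:addressPrefix}" -o table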

Also, for the WEB EPG, please make sure that the selector is IP based.  In the article Multinode Service Graphs with Horizontal Scaling of Firewalls for East/West traffic on Azure we had discussed the need for this in detail.  As a recap, this is so the UDRs are programmed properly on the Azure fabric.

Figure 10

Now, on MSO, create the filter and Contract that we will apply from the HUB “all-internet” external EPG to the WEB EPG.  For the POC, I would suggest just using an any filter.

Figure 11

Now, let’s go to cAPIC and export that contract to the Infra vNET.

Figure 12

From cAPIC, go to EPG Communications and tie the external EPG “all-internet” to the exported contract.

Figure 13

Now add the original Provider contract to the WEB EPG

Figure 14

On the Azure Console, make sure to set up 2 Standard SKU Public IPs for the WEB Servers.  We will be spinning up 2 WEB Servers (the figure below shows only 1 as an example; please create 2 of these).

Figure 15
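
If you prefer the CLI for this step, the equivalent is roughly the following; the name and resource group are placeholders, and you would run it twice (once per WEB VM):

# Create a Standard SKU public IP for WEB VM1 (repeat with a different name for WEB VM2)
az network public-ip create \
  --resource-group <tenant-rg> \
  --name web-vm1-pip \
  --sku Standard \
  --allocation-method Static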

Now, go ahead and spin up your 2 Web Servers.  Don’t forget to use the Standard SKU Public IPs that you created while spinning them up.

Figure 16

You should be able to ssh right into your WEB Servers now as shown in the figure below.  Remember we are using the external connectivity from the HUB Network (infra vNET) to reach the WEB servers.

Figure 17

From the WEB servers you will not be able to ping Internet destinations, because we don’t yet have a reverse contract with the external EPG “all-internet” as provider and the WEB EPG as consumer.

Figure 18

Now, from cAPIC, let’s configure a contract in the Infra vNET that we will use for traffic going out from the Web Farm to the Internet.  Also make sure that the contract has Global scope, and export it to the Tenant.

Figure 19

Apply the exported contract as consumer to the Tenant WEB EPG.

Figure 20

Now, apply the original contract as provider to the HUB “all-internet” external EPG.

Figure 21

Outbound pings will now work as you can see in the figure below.  Note that all communication to and from the WEB Servers is going through the Infra vNET.

Figure 22

Now that both WEB VMs are up and running and have egress and ingress connectivity, let’s go ahead and bring up web containers on them.  This should take you just a few minutes by following the instructions below and pulling the pre-made container images from my GitHub repo.

Figure 23

For WEB VM1, do the below:

  1. From Azure console get the public IP of WEB VM1
  2. SSH to WEB VM1
  3. Run the below commands to get your web container ready (should take a few minutes only)
    • sudo  -i 
    • apt-get update && apt-get upgrade  -y 
    • echo net.ipv4.ip_forward=1 >> /etc/sysctl.conf 
    • sysctl  -p 
    • exit
    • sudo apt install docker.io -y
    • sudo systemctl start docker 
    • sudo systemctl enable docker 
    • sudo groupadd docker 
    • sudo usermod -aG docker $USER 
    • log out and ssh back in for this to work
    • docker  --version
    • sudo apt install docker-compose  -y
    • git clone https://github.com/soumukhe/aws-aci-lb-ec2-1.git
    • cd aws-aci-lb-ec2-1/
    • docker-compose up  --build -d 

Figure 24
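
Before moving on to the second VM, a quick sanity check on WEB VM1 never hurts.  This assumes the compose file brings the nginx container up on tcp 9001 (which is what the repo above does):

# Run on WEB VM1: confirm the container is up and answering locally on 9001
docker ps
curl -I http://localhost:9001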

For WEB VM2, do the below:

  1. From Azure console get the public IP of WEB VM2
  2. SSH to WEB VM2
  3. Run the below commands to get your web container ready (should take a few minutes only)
    • sudo  -i 
    • apt-get update && apt-get upgrade  -y 
    • echo net.ipv4.ip_forward=1 >> /etc/sysctl.conf 
    • sysctl  -p 
    • exit
    • sudo apt install docker.io -y
    • sudo systemctl start docker 
    • sudo systemctl enable docker 
    • sudo groupadd docker 
    • sudo usermod -aG docker $USER 
    • log out and ssh back in for this to work
    • docker  --version
    • sudo apt install docker-compose  -y
    • git clone https://github.com/soumukhe/aws-aci-lb-ec2-2.git
    • cd aws-aci-lb-ec2-2/
    • docker-compose up  --build -d 

Figure 25

Both of these VMs are serving up an nginx container on tcp port 9001 to the outside world.  So, from your local PC/Mac, browse to http://<public-ip>:9001 for each server and you should see both the WEB Server 1 (blue) and WEB Server 2 (green) UIs show up.

Figure 26
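
The same check from a shell on your local machine, in case you prefer curl over a browser (the two IPs are placeholders for the public IPs you noted from the Azure console):

# Both web servers should answer with HTTP 200 on port 9001
curl -I http://<web-vm1-public-ip>:9001
curl -I http://<web-vm2-public-ip>:9001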

Now, it’s time to create the service devices, spin them up, and create and attach the Service Graph.

Before doing that, let’s recall a few items.

Item 1:  External connectivity for the FW management interface:  We’ve discussed this in a previous writeup, Multinode Service Graphs with Horizontal Scaling of Firewalls for East/West traffic on Azure.  During that exercise we had made an EPG in the HUB for FW management and had applied a provider/consumer contract between that EPG and the “all-internet” external EPG.  That configuration is still there, so I don’t have to do it again.  I’ll just put these firewall management interfaces on that same FW-Ext EPG.  Please look at that article if you need help doing that.

Figure 28

Also, remember that we had tagged the FW-mgmt EPG in the HUB with “tag==fw-mgmt”.  The idea is that when we spin up the 3rd party firewall from the Azure Console, we will tag the management interface of the firewall with the same tag, so the correct security groups get applied on the Azure side and we can get into the firewall (ssh/UI) to configure it.  The diagram below shows the tag that was configured on the “fw-mgmt” EPG (view from cAPIC).

Figure 29

Item 2:  Overlay-2 subnets for housing service devices:  Let’s also quickly take a look at the overlay-2 subnets that we defined for housing our service devices.  We did this during the initial cAPIC setup.  You can review that document here:  ACI/Cloud Extension Usage Primer (Azure) – ACI 5.0.2 cAPIC Feature Listing and First Time Setup differences.

Below is a screenshot from cAPIC as a reference.

Figure 30

Now, let’s start creating the Service Devices.

To start off with, let’s create the HUB NLB (Internet Facing).

A screenshot of this is shown below for your guidance.

Figure 31

Next, let’s create the HUB Firewall.   A screenshot of this is shown below.

Figure 31a

After clicking on “Add Interface”, please go ahead and create your data interfaces and make sure to use IP selectors (not tag-based selectors).  Keep in mind that the management interface is not configured here.

Figure 32

Go ahead and also create the other interfaces.  I also have a G03-notused interface, which is optional.  The reason I did that is that the ARM template used to spin up the firewall defines 3 interfaces for the ASA, and I wanted to tag that one as well.

Figure 34

After the 3rd party firewall modelling is done, we need to create the Tenant Provider-side NLB.  This is shown in the figure below.  Make sure to choose Internal.  I also happened to choose a Dynamic IP, but you could choose a static IP.  Make sure that you choose the NLB-Subnet for this; do not choose the Workload subnet.

Figure 35

Now, it’s time to go to Azure Console and spin up the actual firewalls from the Azure ARM template.  Recall that in the writeup Multinode Service Graphs with Horizontal Scaling of Firewalls for East/West traffic on Azure,  I showed you how to do that.   If you have already done the previous exercise you will still have the template in your templates area.  If not, I am pasting the info on obtaining the templates here again.

There are several ways to install the 3rd party firewall in Azure.  In an upcoming release this will be even easier and you could do this directly from cAPIC itself and not even have to go to the Azure console.

For now, the most common way would be:

  1. In the Azure console, go to the Marketplace, look for the firewall you are interested in, subscribe to it and then deploy it.  However, you will need to fix the parameters before you deploy so they conform to the overlay-2 subnets you have defined.
  2. Deploy from a pre-made template built just for this purpose.  If you do it this way you can make some minimal parameter changes in the JSON template file and then deploy.

If you use method 2, you will first need to do a Google search for the ARM template for the firewall you are interested in.  Then get the JSON file for the template.  You can then modify that JSON file in a text editor (like Sublime or Atom) before deploying.  You can also edit the parameters while deploying.

For the purpose of this exercise you can get the ASA template from my git repository.  Just click here for the ARM ASA template for cAPIC.

In that same repo, I also have a FTD template and a Palo Alto Networks Firewall template.  Click here for those

Below Figure shows you the ASA template in my git repo.

Figure 36

While deploying the template, you have to fill in the fields based on your overlay-2 subnet setup.  Below is what I used for my first ASA setup.  (Remember, this needs to go into the Infra subscription!  That’s where the hub is.)

Figure 37
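
If you would rather deploy the template from the CLI than from the portal’s custom deployment blade, a rough equivalent is shown below.  The resource group, subscription and file names are placeholders; remember that this goes into the Infra subscription:

# Deploy the ASA ARM template into the Infra subscription (names and file names are placeholders)
az account set --subscription <infra-subscription-id>
az deployment group create \
  --resource-group <infra-rg> \
  --template-file asa-template.json \
  --parameters @asa1-parameters.json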

After the template is deployed, go to the ASA VM on the Azure console and tag the 1st interface of the ASA with tag==fw-mgmt.  Recall that our HUB EPG for FW-Mgmt was configured with a selector of fw-mgmt.  The other interfaces were configured for IP-based matching from cAPIC, so we don’t have to bother with them.

Figure 38
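
The same tag can also be applied from the CLI.  The NIC name and resource group below are placeholders from your deployment, and I am assuming a tag key of tag with a value of fw-mgmt; whatever you use must exactly match the selector on the fw-mgmt EPG shown earlier:

# Tag the ASA1 management NIC so it matches the fw-mgmt EPG selector (names are placeholders)
az network nic update \
  --resource-group <infra-rg> \
  --name <asa1-mgmt-nic> \
  --set tags.tag=fw-mgmt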

Now spin up your 2nd ASA and tag the interfaces the same way as you did for the 1st ASA.

Figure 39

Below are the filled-in fields that I used for my ASA2.  Please note that the IP addresses for the interfaces are different.

Figure 40
Figure 41

Again, tag the management interface of the 2nd ASA.

Figure 42

Let’s go ahead and configure the ASAs.  Go to the Serial Console and get into ASA1 first.  You will notice that Management0/0 is already configured with DHCP.

Figure 45

From the console, make the changes on the Management0/0 interface: take it off DHCP and make it management-only.  The reason we are doing this is that we want to add a default route out the outside interface of the ASA, and we don’t want to get that mixed up with the management interface.  This is almost like having separate contexts on the ASA.  Once you change the management interface, you can ssh into the device and configure the rest.

Figure 46

Now configure the access list and the DNAT/SNAT configuration on ASA1 as shown below.  The single twice-NAT statement does both jobs: web traffic arriving on the ASA outside interface on port 80 (www) has its destination translated to the Tenant NLB VIP on port 9001, and its source translated to the ASA inside interface IP, so return traffic comes back through the same firewall.

Figure 47

For your cut and paste pleasure:

access-list access-list-inbound extended permit icmp any any 
access-list access-list-inbound extended permit ip any any
access-list access-list-inbound extended deny ip any any log disable
access-list mgmt extended deny ip any any log disable

access-group access-list-inbound in interface outside
access-group access-list-inbound in interface inside

object network web-nlb-vip
  host 10.81.255.5
object service web-port
  service tcp destination eq www
object service web-port-translated
  service tcp destination eq 9001
nat (outside,inside) source dynamic any interface destination static interface web-nlb-vip service web-port web-port-translated

Check that SNAT/DNAT is configured correctly

Figure 48

Now let’s configure ASA2

Figure 49

Go into the Serial Console for ASA2

Figure 50

Configure the management interface and then configure the others from an SSH session to the ASA

Figure 51

Complete the setup by configuring the access-lists and NAT.

Figure 52

For your cut and paste pleasure:
Note:  You will notice that for “object network web-nlb-vip” I am using a host value of “10.81.255.5”.  That is because when I check from the Azure UI, I see that my front-end NLB has acquired the IP 10.81.255.5.  Please check what the front-end IP is for your NLB and adjust accordingly.

access-list access-list-inbound extended permit icmp any any 
access-list access-list-inbound extended permit ip any any
access-list access-list-inbound extended deny ip any any log disable
access-list mgmt extended deny ip any any log disable

access-group access-list-inbound in interface outside
access-group access-list-inbound in interface inside

object network web-nlb-vip
  host 10.81.255.5
object service web-port
  service tcp destination eq www
object service web-port-translated
  service tcp destination eq 9001
nat (outside,inside) source dynamic any interface destination static interface web-nlb-vip service web-port web-port-translated

Verify that the NAT is configured correctly

Figure 53

Now, it’s time to configure the Service Graph using the devices that we configured in the previous steps

Figure 54

We’ll create the Service Graph from cAPIC

Figure 55

Now create the Service Graph as shown below.  Choose the 1st device as the consumer NLB

Figure 56

Select the HUB load balancer that you created earlier

Figure 57

Next, choose the Firewall as the 2nd device.

Figure 58

Next, choose the Provider side NLB as the 3rd device

Figure 59

Our Service Graph Creation is done.  All we have to do now is to attach the Service Graph to the Provider side contract and configure the VIPs on the NLBs.

Figure 60

On cAPIC, go to the Intent icon and then EPG Communication.

Figure 61

Select the Tenant Provider Side Contract

Figure 62

Add in the Service Graph that you created earlier

Figure 63

Double Click on the HUB NLB Icon

Figure 64

Configure the Listener as shown below.  Make sure the Health Checks are on port 22, since the backend pool members are the ASAs.

Figure 65

There is nothing to configure here for the 3rd party firewall.  Double click on the Provider-side NLB.

Figure 66

Click on Add Cloud Load Balancer Listener.

Figure 67

Configure the listener as shown below.  In this case we configured it as an HA port, meaning it load balances all tcp/udp ports.

Figure 68

Before testing, let’s do some quick checks to make sure the configurations are good.

Figure 69

Here we are viewing the HUB NLB front end from the Azure Console

Figure 70

We also confirm that both ASAs are showing up on the backend pool

Figure 71

Health Probe is looking good

Figure 72

We now view the Tenant NLB Front End IP

Figure 73

We confirm that both the WEB Servers are showing up as the Target pool

Figure 74

Health Probe for the Tenant NLB looks good

Figure 75
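
If you prefer to run these checks from the CLI instead of clicking through the portal, the rough equivalents are below.  All of the names are placeholders; the load balancers are created by cAPIC, so look up the actual resource group and LB names first (for example with az network lb list -o table):

# HUB NLB: front-end IP, backend pool (the ASAs) and health probe
az network lb frontend-ip list --resource-group <infra-rg> --lb-name <hub-nlb> -o table
az network lb address-pool list --resource-group <infra-rg> --lb-name <hub-nlb> -o table
az network lb probe list --resource-group <infra-rg> --lb-name <hub-nlb> -o table

# Tenant NLB: same checks against the provider-side load balancer
az network lb frontend-ip list --resource-group <tenant-rg> --lb-name <tenant-nlb> -o table
az network lb address-pool list --resource-group <tenant-rg> --lb-name <tenant-nlb> -o table
az network lb probe list --resource-group <tenant-rg> --lb-name <tenant-nlb> -o table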

Now, it’s time to do “Testing & Conclusion”

Figure 76

Pointing my local Mac’s browser at the VIP of the HUB NLB, I reach WEB Server 1, and on every subsequent refresh I still reach WEB Server 1.  If you click on “More Info” in the Web Server UI, you will see that the source IP of the packets arriving at the WEB server is rotating between the 2 Firewalls.

Figure 77
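
You can reproduce the browser test from a shell as well.  The VIP below is a placeholder for the HUB NLB front-end IP; the listener is on port 80 and the ASAs translate that to the web farm on 9001:

# Quick reachability test against the HUB NLB VIP (placeholder IP); expect HTTP 200
curl -I http://<hub-nlb-vip>/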

To make sure that WEB Server 2 (the green server) is actually in the pool and working, I go to my ssh session on WEB-VM1 and do a “docker-compose down”.

Figure 78

Now, when I refresh my local browser, I reach the Green Server (WEB Server 2).  We also verify from the display that the packets are load balancing between the Firewalls.

Figure 79

Conclusion:  FW scaling is working fine.  However, with an NLB on the Provider Tenant, WEB load balancing is not optimal in this topology.  This is because the packets reaching the NLB are coming from just the 2 inside-interface IPs of the firewalls: ASA1 at 100.64.7.60 and ASA2 at 100.64.7.61.

Let’s replace the Provider-side NLB with an ALB.  The ALB does load balancing at the application level, so the expectation is that it will also load balance the WEB servers better.

Figure 80

To test out our theory, let’s first go to Web Server 1 from the console, turn nginx back on, and verify using curl that it’s serving the content.

Figure 81

Next, from MSO, I create the ALB subnet in the Provider VRF (just add another subnet, 10.81.254.0/24).

Figure 82

The steps needed now to bring up the Service Graph with an ALB on the Provider Tenant instead of an NLB are as follows:

  1. Detach the Service Graph from Provider Contract
  2. Delete the Service Graph
  3. Bring up the ALB, (Standard V2 SKU) ( make sure to put the ALB on the ALB subnet you brought up)
  4. Recreate the Service Graph with NLB(hub)—FW(hub)(dnat/snat)—ALB—Provider EPG
  5. Attach the Service Graph back to the Provider Contract
  6. Make a slight modification to the ASA config to point to the new VIP of the ALB

Figure 83

ASA object network config now has to be changed to point to the ALB VIP.  This is shown below

Figure 84

Below is the Listener configuration done on the ALB when applying the Service Graph to the Contract

Figure 85

Before testing, please go to the Azure console and look at the state of the ALB.  If the ALB health check says “Azure Application Gateway failed”, then the Azure Application Gateway is probably in an updating state, which it will also show you.
Please go to the Azure Console and check “Frontend IP configurations, Backend Pools, Listeners, Health Probes”.

Figure 85a
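
You can also query the Application Gateway’s state and backend health from the CLI; the resource group and gateway names below are placeholders (the gateway itself is created by cAPIC):

# Provisioning / operational state of the provider-side Application Gateway (names are placeholders)
az network application-gateway show \
  --resource-group <tenant-rg> --name <provider-alb> \
  --query "{provisioningState:provisioningState, operationalState:operationalState}"

# Backend health of the web farm behind the ALB
az network application-gateway show-backend-health \
  --resource-group <tenant-rg> --name <provider-alb>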

On the Health Probe, click on the listener

Figure 85b

Click on Test

Figure 85c

The health check should come back healthy.

Figure 85d

Now it’s time to see the results by doing a browsing test.  Make sure to browse to http://<ip>, where <ip> is the HUB NLB front-end IP.

Figure 85e

Refreshing my local browser on the Mac now shows that the WEB Servers are also load balancing on every refresh.  Also, note from below that the source IP of the packets is no longer the firewall inside interfaces.  This is because the ALB does SNAT.  The SNAT IP is picked by Azure from the same subnet as the VIP subnet; in this case it happens to be 10.81.254.5, whereas the VIP was 10.81.254.10.

Figure 86
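
A quick way to see the round-robin behavior without hammering the refresh button is a small curl loop; the VIP and the grep pattern are placeholders for your environment (the pattern just needs to match something that differs between the blue and green pages):

# Hit the HUB NLB VIP repeatedly; with the ALB on the provider side the
# responses should alternate between the two web servers
for i in $(seq 1 6); do
  curl -s http://<hub-nlb-vip>/ | grep -o -i 'web server [12]' | head -1
done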
