ACI Fabric in Google Cloud

Table of Contents:

  1. Introduction
  2. Theory behind ACI / Cloud Integration
    2a. Google ACI Fabric differences from AWS/Azure ACI Fabrics
  3. References

Introduction

It has been possible to integrate Cloud APIC in Google Cloud and create an ACI Fabric in Google Cloud since September of 2021. The latest image available in the Google Cloud Marketplace for cAPIC as of this writing (April 2022) is release 25.0(3k).

Building ACI Cloud Fabrics in AWS and Azure, and integrating them with the physical onPrem ACI Fabric, has been supported for a long time now. In addition, brownfield integration and non-ACI external data center integration into this mix are also supported. Google Cloud can now be integrated into the mix as well.

In this writeup, I will discuss the theory behind the Google Cloud integration and some architectural differences compared to the AWS/Azure ACI Fabric integrations. In future writeups (as time permits), I will cover how to install the GCP ACI Fabric, create Tenants, and orchestrate with the GCP Cloud ACI Fabric in the mix.

Theory behind ACI / Cloud Integration

Now that most organizations have both a cloud presence and an onPrem data center presence, there is a need to seamlessly integrate all the data centers, onPrem and cloud.

Every cloud provider has its own set of APIs. Thus, if you were going to do the integration manually, you would have to use a different method for each cloud. In addition, you would have to orchestrate the infrastructure using different methods.

Cisco ACI physical APICs (the onPrem controllers) and Cloud APICs (the cloud ACI Fabric controllers) each know the APIs and objects needed for their respective domains. Each of these APICs can orchestrate in its own domain.
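To make the idea of a common object model concrete, here is a minimal sketch (in Python, using the requests library) of creating a Tenant directly against an APIC's REST API. The hostname and credentials are placeholders; in practice NDO drives the APICs for you, so this is only to show what "knowing the native APIs" looks like at the lowest level.

    import requests

    APIC = "https://apic.example.com"   # placeholder APIC/cAPIC address
    session = requests.Session()

    # Authenticate; the APIC returns a session token as a cookie.
    login = {"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}
    session.post(f"{APIC}/api/aaaLogin.json", json=login, verify=False)

    # Push the same fvTenant object regardless of domain; each APIC
    # translates it natively (a Tenant onPrem, an AWS Account, an Azure
    # Subscription, or a GCP Project, as the table further down shows).
    tenant = {"fvTenant": {"attributes": {"name": "DemoTenant"}}}
    resp = session.post(f"{APIC}/api/mo/uni.json", json=tenant, verify=False)
    print(resp.status_code, resp.text)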

Cisco Nexus Dashboard Orchestrator (NDO) is the glue that ties all these ACI Fabric domains together. NDO sends a common set of API commands to each APIC controller; each APIC controller knows the native APIs required to program its own domain and does the job. At the end, you have a single pane of glass for orchestration! The figure below gives you an illustration of this.

Figure 1: High Level overview of ACI Fabric

As mentioned earlier, each cloud provider has its own set of APIs and objects. The APICs (physical and cloud) use a common set of ACI objects and do the respective translation in each domain. So, as long as you are familiar with the ACI objects, the APICs do the job for their domain.
Let's take the high-level Tenant object as an example. The following table shows the corresponding objects in their respective domains.

Domain              ACI Object   Corresponding Domain Object
onPrem ACI Fabric   Tenant       Tenant
AWS ACI Fabric      Tenant       AWS Account
Azure ACI Fabric    Tenant       Azure Subscription
GCP ACI Fabric      Tenant       GCP Project

It's important to remember that:

  • For the Azure ACI Fabric, the Infra Tenant (where we house the cAPICs and C8KV routers) and the user Tenants can be spun up in the same Azure subscription.
  • For the AWS ACI Fabric, the Infra Tenant (where we house the cAPICs and C8KV routers) and the user Tenants have to be in different AWS Accounts. You will need a separate AWS Account for each Tenant.
  • For the GCP ACI Fabric, the Infra Tenant (where we house the cAPICs and the GCP native cloud routers, as discussed later) and the user Tenants have to be in different GCP Projects. You will need a separate GCP Project for each Tenant (see the sketch after this list).
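Because each user Tenant maps to its own GCP Project, onboarding a GCP Tenant starts with creating a project. Below is a minimal sketch using the google-cloud-resource-manager Python client; the organization and project IDs are placeholder assumptions, and in practice you would then register the project as a Tenant in cAPIC/NDO.

    from google.cloud import resourcemanager_v3

    # Placeholder IDs; substitute your own GCP organization and naming.
    PARENT = "organizations/123456789012"
    PROJECT_ID = "aci-tenant-demo"

    client = resourcemanager_v3.ProjectsClient()

    # One dedicated project to back one ACI user Tenant.
    project = resourcemanager_v3.Project(
        project_id=PROJECT_ID,
        parent=PARENT,
        display_name="ACI Tenant Demo",
    )
    operation = client.create_project(project=project)
    print(operation.result())  # blocks until the project exists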

The diagram below is very handy for comparing objects across the different domains.
Figure 2: Comparing ACI Fabric Object mappings to their corresponding Domain Objects

Google ACI Fabric differences from AWS/Azure ACI Fabrics

In the AWS/Azure ACI Fabrics we do the following:

  • For onPrem to Cloud, we build BGP EVPN sessions from the onPrem ACI Spines to the Cloud Routers for the control plane (prefix exchanges) between the sites. For the data plane we use VXLAN encapsulation.
  • The control plane and data plane traffic goes inside IPSec tunnels when connecting the sites over the Internet.
  • The control plane and data plane traffic can optionally go through IPSec tunnels, if desired, when using AWS Direct Connect/Azure ExpressRoute.
  • For Cloud to Cloud ACI Fabric, we build BGP EVPN control plane sessions between the cloud routers and VXLAN tunnels for the data plane. All traffic goes over IPSec tunnels.
  • Now, using NDO, you can stretch your policies seamlessly across sites to have a consistent policy. Each domain controller will do the necessary orchestration for its domain.

The two high-level diagrams below illustrate these concepts.
Figure 3: High Level Model of how the interconnect works.

Figure 4: High Level Model of how the interconnect works with Direct Gateway.

  • For the Google Cloud ACI integration, we currently do not support a BGP EVPN Control Plane and VXLAN Data Plane. These features will come in a newer release, around the June/July 2022 time frame.
  • We don't spin up Cisco C8KV cloud routers. Rather, we use the GCP native Cloud Router in the Infra Tenant. We create IPSec tunnels from the GCP native Cloud Router, and inside the IPSec tunnels we establish IPv4 BGP sessions for the control plane. The data plane is native IPv4 traffic inside the IPSec tunnels (a sketch of the underlying Cloud Router object appears after this list).
  • We do not support GCP Cloud Interconnect (the equivalent of AWS Direct Connect or Azure ExpressRoute).
  • Also, as mentioned previously, unlike Azure and much like AWS, the GCP ACI Fabric requires a GCP Project to house the ACI Infra Tenant, and each user Tenant needs to have a separate GCP Project.
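cAPIC automates the creation of these native Cloud Routers in the Infra Tenant's project. For a feel of the underlying object, here is a minimal sketch of creating one with the google-cloud-compute Python client; the project, region, network, and ASN are placeholder assumptions, not the values cAPIC actually derives during fabric bring-up.

    from google.cloud import compute_v1

    # Placeholder values; cAPIC computes the real ones automatically.
    PROJECT, REGION = "aci-infra-demo", "us-central1"

    router = compute_v1.Router(
        name="aci-cloud-router",
        network=f"projects/{PROJECT}/global/networks/infra-vpc",
        # BGP speaker for the IPv4 sessions that run inside the IPSec tunnels.
        bgp=compute_v1.RouterBgp(asn=65534),
    )

    client = compute_v1.RoutersClient()
    operation = client.insert(project=PROJECT, region=REGION, router_resource=router)
    operation.result()  # wait for the Cloud Router to be created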

The implications of not supporting the BGP EVPN Control Plane and VXLAN Data Plane for the GCP ACI Fabric (this feature will come around the June/July 2022 time frame) are:

  • With BGP EVPN Control Plane and VXLAN Data Plane support (the AWS/Azure case), the connectivity to the onPrem Fabric (if onPrem is in the mix) is from the cloud routers (C8KVs) to the ACI Spines.
  • With the BGP IPv4 control plane and native IPv4 data plane, the connectivity to the onPrem Fabric (if onPrem is in the mix) is from the GCP native routers to ACI Leaf L3Outs. Each Tenant will need its own L3Out, in its own VLAN on the L3Out side, peering with a routing protocol to the edge L3Out routers in the physical site. The L3Out router can then create the sessions to the GCP native routers. The configuration required on the onPrem router for establishing the IPSec tunnels and BGP sessions is downloadable from NDO.
  • Similarly, for Cloud to Cloud, the IPSec tunnels and BGP sessions go from cloud router to cloud router. The only difference is that the configuration is fully automated; you don't need to download configuration and apply it to any router as in the onPrem case (a sketch of the kind of tunnel object being automated follows this list).
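For the cloud-to-cloud case, what gets automated under the covers is ordinary GCP VPN plumbing. Below is a minimal sketch of one such piece, creating an HA VPN tunnel with the google-cloud-compute client; every name, address, and secret is a placeholder, the referenced gateway and router objects are assumed to already exist, and cAPIC builds the real tunnels (plus the matching BGP sessions) for you.

    from google.cloud import compute_v1

    PROJECT, REGION = "aci-infra-demo", "us-central1"

    # One leg of an IPSec tunnel toward the peer cloud site (placeholders).
    tunnel = compute_v1.VpnTunnel(
        name="aci-intersite-tunnel-1",
        ike_version=2,
        shared_secret="use-a-real-secret",
        router=f"projects/{PROJECT}/regions/{REGION}/routers/aci-cloud-router",
        vpn_gateway=f"projects/{PROJECT}/regions/{REGION}/vpnGateways/aci-ha-vpn",
        vpn_gateway_interface=0,
        peer_external_gateway=f"projects/{PROJECT}/global/externalVpnGateways/peer-site-gw",
        peer_external_gateway_interface=0,
    )

    client = compute_v1.VpnTunnelsClient()
    client.insert(project=PROJECT, region=REGION, vpn_tunnel_resource=tunnel).result()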

A depiction of this is shown in the figure below.

Figure 5: A Depiction of Connectivity between Sites with Route Leaking

  • While doing the orchestration for Tenants, because there is currently no BGP EVPN and VXLAN support, the Tenants are considered site-local Tenants only, meaning you cannot stretch policies. You have to bring up Tenants in each domain (using NDO) and create local policies for each Tenant. Then, using NDO, you have to create the necessary route leaking from the Tenant space to the L3Out, so that prefixes can be exchanged with other sites to enable communications. The figure below shows the template type (inside the NDO Schema) of Cloud Local only that needs to be chosen when having the GCP ACI Fabric in the mix.
    Figure 6: NDO Cloud Local Template (inside Schema)

  • All connectivity to other sites, whether it be a physical ACI Fabric or a non-ACI DC, is considered external connectivity. In fact, the type of connectivity is exactly the same as described in my previous article: Cisco Cloud ACI Generic External Connectivity

  • Another point to note is that ACI Contracts have been decoupled so that they no longer control both Security Policy and Routing. ACI Contracts control security, while Route Maps (which automatically get configured when you configure route leaking) define the routing policy.

  • The figure below shows the setting in the cAPIC initial setup screen that controls this. For GCP cAPIC, you cannot currently toggle this on. This will be toggleable (not sure if that's a word, but I like it) in the future.
    Figure 7: Routing Policy and Security Policy Controls decoupled.

References

Getting Started with Cisco Application Centric Infrastructure
Cisco Cloud APIC for Google Cloud Installation Guide, Release 25.0(x)