Cisco ACI APIC and cAPIC monitoring using Prometheus/Grafana

From release 5.3 of APIC and release 25.x of cAPIC, the Prometheus Node Exporter feature is available. It lets you collect statistics with Prometheus and view pre-configured dashboards in Grafana to get a quick status of your APICs/cAPICs.

In this write-up, I will show you how to set up and bring up Prometheus/Grafana to monitor and visualize stats, so you can try this out for yourself in a POC.

What is Prometheus/Grafana:

  • Prometheus is an Open Source monitoring solution
  • Started at SoundCloud around 2012-2013 and was made public in early 2015
  • It was inspired by Google’s Borgmon, which uses time-series data as a data source to send alerts based on that data
  • Prometheus provides metrics and alerting. It uses dimensional data: time series are identified by a metric name and a set of key/value pairs
  • Prometheus stores metrics in memory and on local disk in its own custom, efficient format
  • Prometheus is written in Go
  • Prometheus includes a flexible query language
  • Visualizations can be shown using the built-in expression browser or with Grafana integration
  • There are tons of pre-built Grafana dashboards that you can download easily by the dashboard number from https://grafana.com/grafana/dashboards
  • It fits very well in the cloud native infrastructure
  • Many client Libraries and Integrations are available
  • Prometheus collects metrics from monitored targets by scraping metrics (http endpoints)
  • A single Prometheus server is able to ingest up to 1 million samples per second as several million time series
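
To make the "metric name plus key/value pairs" idea concrete, here is a minimal Python sketch that parses a few lines of the text format a node exporter serves on its HTTP endpoint. The sample lines and the device label value are made up for illustration, and the parser is deliberately simplified; it is not how Prometheus itself parses a scrape.

```python
import re

# One sample line looks like:  metric_name{label="value",...} 123.0
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Return (metric_name, labels, value) for a sample line, else None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line, or a # HELP / # TYPE comment
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(raw_value)


# Illustrative input only; real exporter output has many more lines.
SAMPLE_TEXT = """# TYPE node_disk_read_bytes_total counter
node_disk_read_bytes_total{device="sda"} 1048576
node_memory_Active_anon_bytes 2.5e+08
"""

if __name__ == "__main__":
    for text_line in SAMPLE_TEXT.splitlines():
        parsed = parse_sample(text_line)
        if parsed is not None:
            print(parsed)
```

Each distinct combination of metric name and label set is its own time series, which is why one exporter can produce a large number of series.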

Let’s get started.

First, let’s go ahead and enable the Prometheus Node Exporter on cAPIC 25.x (this will be supported in APIC 5.3, and the procedure will be exactly the same).

On APIC/cAPIC, go to System Configuration / Management Access and click the edit button for HTTPS, as shown in the figure below.

Figure 1: Enabling Prometheus Node Exporter on APIC/cAPIC step 1

On the next page, do the following:

  • Enable Node Exporter
  • Use TCP port 443
  • Allow Origins:  https://127.0.0.1:8000
  • Confirm that the web service on APIC/cAPIC will restart (no disruption to anything other than a temporary disruption of your connection to the APIC/cAPIC GUI)
Figure 2: Enabling Prometheus Node Exporter on APIC/cAPIC step 2

Repeat these steps for every APIC 5.3 / cAPIC 25.x that you wish to monitor.

Now let’s go ahead and bring up the Docker containers for Prometheus/Grafana.

1) You need a VM that is already running Docker and docker-compose. If you don't have one and need help, please see the bottom part of the README file at: https://github.com/soumukhe/apic-prometheus#readme
2) git clone https://github.com/soumukhe/apic-prometheus.git
3) cd apic-prometheus/prometheus
4) vi prometheus.yml and modify lines 65, 69, 74, and 78 as appropriate
5) Add more APICs/cAPICs as appropriate, as indicated on line 82
Figure 3: Modifying the prometheus.yml file (step 4 & 5 above)
6) docker-compose up -d
7) Execute the command: docker ps --format '{{ .Names }}' | sort |nl
(make sure the 5 containers below are running)
1 apic-prometheus_alertmanager_1
2 apic-prometheus_cadvisor_1
3 apic-prometheus_grafana_1
4 apic-prometheus_node-exporter_1
5 apic-prometheus_prometheus_1
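
If you would rather script the check in step 7 than eyeball the list, a small Python sketch along these lines could compare the docker ps output against the five expected names. The names are copied from the listing above; newer Docker Compose versions may name containers differently (for example with hyphens), so treat the list as an assumption to adjust.

```python
import subprocess

# Container names as listed in this article; adjust if your Compose
# version or project directory produces different names.
EXPECTED_CONTAINERS = [
    "apic-prometheus_alertmanager_1",
    "apic-prometheus_cadvisor_1",
    "apic-prometheus_grafana_1",
    "apic-prometheus_node-exporter_1",
    "apic-prometheus_prometheus_1",
]


def missing_containers(docker_ps_output, expected=EXPECTED_CONTAINERS):
    """Return the expected container names absent from the docker ps output."""
    running = {line.strip() for line in docker_ps_output.splitlines() if line.strip()}
    return [name for name in expected if name not in running]


def running_container_names():
    """Run the docker ps command from step 7 (without sort/nl) and return its output."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{ .Names }}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Sample output used here so the sketch runs without Docker installed;
    # on the VM, pass running_container_names() instead.
    sample = "apic-prometheus_grafana_1\napic-prometheus_prometheus_1\n"
    print("Missing:", missing_containers(sample))
```

An empty result means all five containers are up.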

Prometheus is all set up and ready to go. Now browse to the IP of the Ubuntu VM on port 9090 using HTTP. In my case, I browse to http://10.1.100.13:9090.

The Prometheus interface will come up. In the Expression field, type “node_memory_Active_anon_bytes” and you can see the statistics for it in Table or Graph format.

Notice that as you type in the Expression field, you are presented with all the different statistics that you can look at. Please look through them to see which ones are important for you.
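
The same lookup can also be done outside the browser through the Prometheus HTTP API (the /api/v1/query endpoint). The Python below is a minimal sketch, assuming the server address from my lab (http://10.1.100.13:9090); the function names and the sample response labels are hypothetical and only there to illustrate the response shape.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, expr):
    """Build the instant-query URL for a PromQL expression."""
    query = urllib.parse.urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def extract_samples(response):
    """Turn an instant-vector API response into (labels, value) pairs."""
    if response.get("status") != "success":
        raise ValueError("query failed: %r" % response.get("error"))
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector, got %s" % data["resultType"])
    # Each value is [unix_timestamp, "value-as-string"].
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def instant_query(base_url, expr, timeout=10):
    """Send the query to a live Prometheus server and return its samples."""
    with urllib.request.urlopen(build_query_url(base_url, expr), timeout=timeout) as resp:
        return extract_samples(json.load(resp))


if __name__ == "__main__":
    # Made-up response so the sketch runs without a server; the job and
    # instance labels are placeholders, not values from a real APIC.
    sample_response = {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [
                {
                    "metric": {"__name__": "node_memory_Active_anon_bytes", "job": "apic1"},
                    "value": [1700000000.0, "123456"],
                }
            ],
        },
    }
    print(build_query_url("http://10.1.100.13:9090", "node_memory_Active_anon_bytes"))
    print(extract_samples(sample_response))
```

Against a running setup, calling instant_query("http://10.1.100.13:9090", "node_memory_Active_anon_bytes") should return one entry per scraped APIC/cAPIC.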

Figure 3: Looking at Prometheus Graph for node_memory_Active_anon_bytes

Now, let’s go to Grafana and create a dashboard:

Point your browser to http://<ip_of_ubuntu_vm>:3000. In my case, since my VM IP is 10.1.100.13, I browsed to http://10.1.100.13:3000. The username to log in is admin and the password is cisco. The password value is configured in the grafana/config.monitoring file.
Figure 4: Browsing to Grafana

Now, click on Data sources and then Add data source.

Figure 5: Adding data source to Grafana

As you can see, you can pull in your data from multiple sources (even Elasticsearch). For this example, we’ll use Prometheus.

Figure 6: Adding Prometheus Data Source

Fill in a name and the URL http://<ip_of_ubuntu_vm>:9090. In my case, I used myApics and http://10.1.100.13:9090. Then click Save & test.

Figure 7: Populating data source

You should see a confirmation at the bottom indicating “Data source is working”:

Figure 8: confirmation that Data source is working
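
The data source can also be created without the GUI, through Grafana's HTTP API (POST /api/datasources). The following Python is a rough sketch, assuming the admin/cisco credentials and lab addresses used in this article and basic authentication; the helper names are hypothetical, and the request is only built here, not sent.

```python
import base64
import json
import urllib.request


def datasource_payload(name, prometheus_url):
    """JSON body describing a Prometheus data source for Grafana."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend makes the requests to Prometheus
    }


def build_request(grafana_url, user, password, payload):
    """Build (but do not send) the POST request that creates the data source."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


if __name__ == "__main__":
    payload = datasource_payload("myApics", "http://10.1.100.13:9090")
    request = build_request("http://10.1.100.13:3000", "admin", "cisco", payload)
    print(request.get_method(), request.full_url)
    print(json.dumps(payload, indent=2))
    # To actually create the data source, send it with:
    #   urllib.request.urlopen(request)
```

For a one-off POC the GUI steps are simpler; a script like this mainly helps when you rebuild the lab often.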

Also, notice that you can add many plugins, as shown below:

Figure 8a: Adding Plugins

Now, click on + and Dashboard, then click on Add an empty panel.

Figure 9: Creating dashboards with panel
  • Data source:  myApics
  • Metrics browser:  up
  • Legend: {{job}}
  • Resolution: 1/1
  • Format:  Time series
  • Instant: on
  • Type: Gauge
  • Title: APIC up/down
  • Click Apply
Figure 10: Creating 1st Panel
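
For context on what this first panel shows: up is a metric that Prometheus itself records for every scrape target, with value 1 when the last scrape succeeded and 0 when it failed, and {{job}} in the legend is replaced by each target's job label. As a rough illustration, this Python sketch summarizes the result list of an instant query for up into a per-job status. The job names are placeholders, not the ones in the repository's prometheus.yml.

```python
def up_status_by_job(result):
    """Map each job label to True if all of its targets report up == 1."""
    status = {}
    for item in result:
        job = item["metric"].get("job", "unknown")
        is_up = float(item["value"][1]) == 1.0
        # A job counts as up only if every one of its targets is up.
        status[job] = status.get(job, True) and is_up
    return status


if __name__ == "__main__":
    # Shape of the "result" list returned by /api/v1/query?query=up;
    # job names and timestamps here are invented for the example.
    sample_result = [
        {"metric": {"__name__": "up", "job": "apic1"}, "value": [1700000000.0, "1"]},
        {"metric": {"__name__": "up", "job": "capic1"}, "value": [1700000000.0, "0"]},
    ]
    for job, is_up in sorted(up_status_by_job(sample_result).items()):
        print(job, "UP" if is_up else "DOWN")
```

The Gauge panel presents the same 1/0 value per job, which is why Instant is switched on for it.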

Drag the panel from the edge to fill the entire row, as shown below:

Figure 11: dragging the panel to fill entire row

Next, add another panel to the dashboard by clicking the Add panel button, as shown below:

Figure 12: Add Panel

For this panel, make the following selections:

  • Data source:  myApics
  • Metrics browser:  node_disk_read_bytes_total
  • Legend: 
  • Resolution: 1/1
  • Format:  Time series
  • Instant: off
  • Type: Time Series
  • Title: Disk Bytes Read
  • Click Apply
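
A note on this metric: node_disk_read_bytes_total is a counter, so the panel above plots a cumulative total that only grows. If you prefer bytes per second, PromQL's rate() function (for example rate(node_disk_read_bytes_total[5m])) is the usual tool. The Python below is a simplified sketch of the idea behind it, using made-up samples; Prometheus's real rate() also extrapolates to the window boundaries, which this sketch does not.

```python
def simple_rate(samples):
    """Average per-second increase of a counter.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Returns None when there are too few samples to compute a rate.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # The counter went backwards, so it was reset (e.g. a restart);
            # count the new value as growth from zero.
            increase += value
        else:
            increase += value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


if __name__ == "__main__":
    # Invented samples taken 15 seconds apart.
    samples = [(0, 100.0), (15, 400.0), (30, 1000.0)]
    print(simple_rate(samples), "bytes/second")
```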

Next add another panel:

  • Data source:  myApics
  • Metrics browser:  go_memstats_alloc_bytes_total
  • Legend: 
  • Resolution: 1/1
  • Format:  Time series
  • Instant: off
  • Type: Time Series
  • Title: APIC_memstats_alloc_bytes_total
  • Click Apply

Next add another panel:

  • Data source:  myApics
  • Metrics browser:  node_memory_Active_file_bytes
  • Legend: 
  • Resolution: 1/1
  • Format:  Time series
  • Instant: off
  • Type: Time Series
  • Title: node_memory_Active_file_bytes
  • Click Apply

You can move the panels around as you wish. You can also create collapsible panels if you want to.

Once you are done configuring panels, save the dashboard by clicking the Save button.

Figure 13: Saving Dashboard

Your final dashboard should look like the one shown below:

Figure 14: Final Grafana Dashboard View

References:

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/kb/Monitoring-Metrics-Using-Prometheus-Node-Exporter.html
https://www.cisco.com/c/en/us/support/cloud-systems-management/cloud-application-policy-infrastructure-controller/series.html
https://www.cisco.com/c/en/us/td/docs/dcn/aci/cloud-apic/25x/release-notes/cisco-cloud-apic-release-notes-2501.html
https://grafana.com/docs/grafana/latest/introduction/
https://prometheus.io/docs/introduction/overview/
https://grafana.com/grafana/dashboards/

