AKS Monitoring Deep Dive — Part 1
Azure Kubernetes Service, or AKS for short, is Microsoft Azure’s managed Kubernetes offering. Even though it is a managed service, it still gives you plenty of options for infrastructure and application monitoring that you should take into consideration when running workloads in production.
In this post we mainly focus on the native services and solutions offered in Microsoft Azure. This list is definitely not complete and does not cover most of the third-party solutions that are available. To get an overview of what is available, I recommend taking a look at landscape.cncf.io.
Before we dig deeper into the application space, let us start with what is available to monitor the infrastructure and the service itself. AKS allows you to enable and configure basic monitoring capabilities right at the beginning of a cluster’s lifetime, as part of the Azure Portal experience, the Azure CLI or ARM.
As you can see in the screenshot below, “Container Monitoring” is enabled by default and allows you to attach your AKS cluster to an existing or new Log Analytics workspace:
This integration can be configured right away as part of the initial cluster deployment, or at a later time using the Portal, Azure CLI, ARM or other tooling. This integration deploys Azure’s OMSAgent (as a DaemonSet) to all cluster nodes in your AKS cluster. The OMSAgent then takes care of gathering and sending all relevant performance, health and logging information to the selected Log Analytics workspace.
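For clusters that already exist, enabling the integration through the Azure CLI could look roughly like the following sketch. It only assembles the arguments for `az aks enable-addons`; the cluster name, resource group and workspace ID are placeholders (assumptions, not values from this post), and actually running the command requires an installed and logged-in Azure CLI.

```python
import shlex
from typing import List, Optional


def build_enable_monitoring_command(
    cluster_name: str,
    resource_group: str,
    workspace_resource_id: Optional[str] = None,
) -> List[str]:
    """Return the Azure CLI arguments that enable the monitoring add-on."""
    command = [
        "az", "aks", "enable-addons",
        "--addons", "monitoring",
        "--name", cluster_name,
        "--resource-group", resource_group,
    ]
    # Only pass a workspace when one was given; without it the CLI
    # falls back to a default Log Analytics workspace.
    if workspace_resource_id:
        command += ["--workspace-resource-id", workspace_resource_id]
    return command


if __name__ == "__main__":
    # Placeholder names, replace them with your own cluster and resource group.
    cmd = build_enable_monitoring_command("my-aks-cluster", "my-resource-group")
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be handed to `subprocess.run` if you prefer to execute it from Python.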
This integration already enables a large number of monitoring capabilities for the cluster, its nodes, controllers and containers out of the box, without any additional configuration. The following screenshot shows the built-in Container Insights cluster dashboard:
Besides this high-level performance overview, Container Insights provides us with deeper information about certain areas of our cluster, for example the nodes and their processes and containers:
Or the containers deployed to our cluster:
This includes the option to see foundational configuration details like the image and image tag, but also CPU and memory limits and requests as well as environment variables:
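For orientation, all of these values come from the container entry in a pod spec. The following is a minimal sketch of where they live, using a hypothetical helper `summarize_container` and a made-up container definition; it is not how Container Insights itself collects the data.

```python
from typing import Any, Dict


def summarize_container(container: Dict[str, Any]) -> Dict[str, Any]:
    """Extract image, tag, requests, limits and env vars from a container spec."""
    image = container.get("image", "")
    # Split on the last colon; if the part after it contains a slash, the colon
    # belonged to a registry port and no tag was given. Digest references
    # (image@sha256:...) are not handled in this sketch.
    name, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        name, tag = image, "latest"
    resources = container.get("resources", {})
    return {
        "image": name,
        "tag": tag,
        "requests": resources.get("requests", {}),
        "limits": resources.get("limits", {}),
        # Variables defined via valueFrom have no literal value and map to None.
        "env": {e["name"]: e.get("value") for e in container.get("env", [])},
    }


if __name__ == "__main__":
    # Made-up example container, shaped like an entry in spec.containers.
    example = {
        "name": "web",
        "image": "myregistry.azurecr.io/web:1.4.2",
        "resources": {
            "requests": {"cpu": "250m", "memory": "128Mi"},
            "limits": {"cpu": "500m", "memory": "256Mi"},
        },
        "env": [{"name": "LOG_LEVEL", "value": "info"}],
    }
    print(summarize_container(example))
```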
This is a great addition to the recently released “Kubernetes resources” section, which allows you to see resources like Namespaces, Workloads, Services and Ingresses, Storage as well as Configuration right inside the Azure Portal, without using kubectl or the kubernetes-dashboard.
Going beyond what “Container Insights” offers, the “Kubernetes resources” view also allows you to modify configuration settings directly from within the Azure Portal. See “Access Kubernetes resources from the Azure portal” in the Azure Kubernetes Service documentation on Microsoft Docs for more details about this functionality.
Coming back to “Container Insights”: as you have seen so far, AKS provides you with a comprehensive monitoring solution out of the box, based on Log Analytics and Azure Monitor.
In addition to the monitoring from within the Kubernetes cluster that we have seen in the previous section using OMSAgent and Log Analytics, AKS also lets you gather far more data, especially logging data, from the cluster and its Microsoft-managed control plane. Using “Diagnostic settings”, this data can be stored in Log Analytics, Azure Storage or Azure Event Hubs.
It is interesting to note that you can use one or more of these destinations at the same time, for example to configure different retention times: cheaper Azure Storage for long-term retention or archiving, and Log Analytics only for short-term retention.
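A diagnostic setting that sends control plane logs to two destinations at once could be created with `az monitor diagnostic-settings create`. The sketch below only builds the argument list; the setting name, resource IDs and the selection of log categories are placeholders chosen for illustration, and retention itself is assumed to be configured on the respective destination rather than here.

```python
import json
from typing import List, Optional, Sequence


def build_diagnostic_settings_command(
    setting_name: str,
    cluster_resource_id: str,
    log_categories: Sequence[str],
    workspace_id: Optional[str] = None,
    storage_account_id: Optional[str] = None,
) -> List[str]:
    """Return Azure CLI arguments for a diagnostic setting on an AKS cluster."""
    if not (workspace_id or storage_account_id):
        raise ValueError("at least one destination is required")
    # Each selected log category becomes one enabled entry in the --logs JSON.
    logs = [{"category": category, "enabled": True} for category in log_categories]
    command = [
        "az", "monitor", "diagnostic-settings", "create",
        "--name", setting_name,
        "--resource", cluster_resource_id,
        "--logs", json.dumps(logs),
    ]
    # Destinations are independent, so both can be set on the same setting.
    if workspace_id:
        command += ["--workspace", workspace_id]
    if storage_account_id:
        command += ["--storage-account", storage_account_id]
    return command


if __name__ == "__main__":
    # Placeholder IDs, replace them with the resource IDs from your subscription.
    cmd = build_diagnostic_settings_command(
        "aks-control-plane-logs",
        "<aks-cluster-resource-id>",
        ["kube-apiserver", "kube-audit", "kube-controller-manager"],
        workspace_id="<log-analytics-workspace-resource-id>",
        storage_account_id="<storage-account-resource-id>",
    )
    print(cmd)
```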
With this configuration in place, log files are stored centrally, for example in Azure Storage:
Or, for example, in Log Analytics, where they can now be queried directly:
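A query against these logs could look like the following sketch, which builds a small Kusto (KQL) query as a string. It assumes the logs land in the `AzureDiagnostics` table with the raw message in the `log_s` column, which is the case for the default destination mode; with resource-specific tables the table and column names differ. The helper name `build_control_plane_query` is hypothetical.

```python
def build_control_plane_query(category: str, limit: int = 20) -> str:
    """Build a KQL query returning the newest log lines of one log category."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if '"' in category:
        # Keep the sketch simple: reject input that would break the string literal.
        raise ValueError("category must not contain double quotes")
    return (
        "AzureDiagnostics\n"
        f'| where Category == "{category}"\n'
        "| project TimeGenerated, Category, log_s\n"
        "| order by TimeGenerated desc\n"
        f"| take {limit}"
    )


if __name__ == "__main__":
    print(build_control_plane_query("kube-apiserver", limit=10))
```

The resulting text can be pasted into the Logs blade of the Log Analytics workspace.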
In summary, AKS with its Azure Monitor integration provides powerful first-party monitoring capabilities out of the box, without the need for additional tooling or configuration. The built-in dashboards and monitoring capabilities are a great start for gathering relevant data in Log Analytics as a central store.
In Part 2 of this series we will take a deeper look into the advanced capabilities using Log Analytics and more. In Part 3 we will dive deeper into Prometheus Scraping with Azure Monitor for containers. See you there!