Monitoring an Azure Kubernetes Service (AKS) cluster with Prometheus and Grafana

 

This guide shows how to install Prometheus and Grafana on an AKS cluster with Helm, connect the two, and use them to collect and visualize cluster metrics.

Prerequisites

  • An AKS cluster
  • kubectl configured to access your AKS cluster
  • Helm installed on your local machine

Step 1: Install Prometheus and Grafana using Helm

  1. Add the Helm repositories for Prometheus and Grafana

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

  2. Create a namespace for monitoring

kubectl create namespace monitoring

  3. Install Prometheus

helm install prometheus prometheus-community/prometheus --namespace monitoring

  4. Install Grafana

helm install grafana grafana/grafana --namespace monitoring
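
The install commands above fail when run a second time, because the namespace and both releases already exist. A minimal sketch of a re-runnable variant follows, assuming Helm 3.2 or later for the --create-namespace flag. The function name install_monitoring_stack is a hypothetical helper, not part of either chart.

```shell
# Hypothetical helper: installs or upgrades both releases so it can be re-run safely.
install_monitoring_stack() {
  local namespace="${1:-monitoring}"

  # Register the chart repositories and refresh the local index.
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  helm repo add grafana https://grafana.github.io/helm-charts
  helm repo update

  # "upgrade --install" installs the release if it is missing and upgrades it otherwise;
  # "--create-namespace" creates the namespace when it does not exist yet.
  helm upgrade --install prometheus prometheus-community/prometheus \
    --namespace "$namespace" --create-namespace
  helm upgrade --install grafana grafana/grafana \
    --namespace "$namespace" --create-namespace
}
```

Calling install_monitoring_stack with no argument targets the monitoring namespace used in this guide; passing a name as the first argument targets a different namespace.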

Step 2: Configure Prometheus

Prometheus needs to scrape metrics from your AKS cluster. The Helm chart ships with a default configuration that already scrapes the Kubernetes API server and the kubelet on each node, so basic cluster metrics need no extra setup.

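  1. Check the Prometheus pods

kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus

Older releases of the chart label the pods with app=prometheus instead. If the command returns nothing, run kubectl get pods -n monitoring to list everything in the namespace.

  2. Access Prometheus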

You can access the Prometheus UI using a port forward:


kubectl port-forward -n monitoring deploy/prometheus-server 9090

Open your browser and go to http://localhost:9090.
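
With the port forward running, the same check can be made from a terminal through the Prometheus HTTP API. This is a minimal sketch, assuming curl is installed and the port forward above is listening on localhost:9090. The names prom_query and PROM_URL are hypothetical.

```shell
# Hypothetical helper: runs an instant PromQL query against the Prometheus HTTP API.
PROM_URL="${PROM_URL:-http://localhost:9090}"

prom_query() {
  # -G sends the URL-encoded "query" parameter as part of a GET request;
  # -f makes curl exit with a non-zero status on HTTP errors.
  curl -fsS -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
```

For example, prom_query 'up' returns a JSON document with one sample per scrape target, where a value of 1 means the last scrape succeeded and 0 means it failed.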

Step 3: Configure Grafana

  1. Check the Grafana pods

kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana

  2. Access Grafana

Grafana can also be accessed using a port forward:

kubectl port-forward -n monitoring svc/grafana 3000:80

Open your browser and go to http://localhost:3000.

  3. Log in to Grafana
  • The default username is admin.
  • Retrieve the default password:

kubectl get secret -n monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode

Step 4: Add Prometheus as a Data Source in Grafana

  1. Log in to Grafana and navigate to Configuration > Data Sources (Connections > Data sources in recent Grafana versions).
  2. Add a new data source and select Prometheus.
  3. Set the URL to http://prometheus-server.monitoring.svc.cluster.local:80.
  4. Click on Save & Test.
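
The data source can also be declared through Helm instead of being added in the UI. The sketch below assumes the Grafana chart's datasources value, which the chart renders into a Grafana provisioning file. The function name and the file name grafana-values.yaml are hypothetical.

```shell
# Hypothetical helper: writes a Helm values file that provisions Prometheus as a data source.
write_grafana_values() {
  local file="${1:-grafana-values.yaml}"

  # The quoted heredoc delimiter keeps the YAML literal (no shell expansion).
  cat > "$file" <<'EOF'
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus-server.monitoring.svc.cluster.local:80
        isDefault: true
EOF
}
```

After running write_grafana_values, the file can be applied with helm upgrade grafana grafana/grafana --namespace monitoring -f grafana-values.yaml. Declaring the data source this way keeps it in version control alongside the rest of the release configuration.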

Step 5: Import Grafana Dashboards

  1. Go to Create > Import in Grafana (Dashboards > New > Import in recent versions).
  2. Many pre-built Kubernetes dashboards are listed in the Grafana dashboard directory at grafana.com/grafana/dashboards. For example, dashboard ID 315 is a Kubernetes cluster monitoring dashboard.
  3. Enter the dashboard ID and click on Load.
  4. Select Prometheus as the data source and click Import.

Step 6: Configure Alerting (Optional)

  1. Prometheus Alerting Rules

You can configure alerting rules in Prometheus by editing the values.yaml file used in the Helm chart or by creating a ConfigMap with your custom rules and attaching it to the Prometheus deployment.

  2. Grafana Alerts

Grafana also supports alerting. You can configure alerts based on the metrics visualized on your dashboards.
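
As an illustration of the values.yaml route for Prometheus rules, the sketch below writes a values file containing a single alerting rule. It assumes the serverFiles key with an alerting_rules.yml entry, which the prometheus-community chart uses for rule definitions. The rule name TargetDown, the 5m window, the severity label, the file name, and the function name are illustrative choices.

```shell
# Hypothetical helper: writes a Helm values file with one Prometheus alerting rule.
write_alert_values() {
  local file="${1:-prometheus-alert-values.yaml}"

  # The quoted heredoc delimiter stops the shell from expanding $labels.
  cat > "$file" <<'EOF'
serverFiles:
  alerting_rules.yml:
    groups:
      - name: availability
        rules:
          - alert: TargetDown
            expr: up == 0
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Target {{ $labels.instance }} of job {{ $labels.job }} is down"
EOF
}
```

After running write_alert_values, the rule can be loaded with helm upgrade prometheus prometheus-community/prometheus --namespace monitoring -f prometheus-alert-values.yaml. The expression up == 0 matches any scrape target whose last scrape failed, and the for clause delays the alert until that has been true for five minutes.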

Example Configuration Files

Prometheus ConfigMap (Optional Custom Configuration)

You can create a custom ConfigMap to override the default scrape configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  namespace: monitoring
  labels:
    app: prometheus
data:
  prometheus.yml: |-
    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https
      - job_name: 'kubernetes-nodes'
        kubernetes_sd_configs:
          - role: node
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
      - job_name: 'kubernetes-cadvisor'
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}:10250/proxy/metrics/cadvisor
          - source_labels: [__address__]
            target_label: __param_target
            replacement: https://${1}:443/metrics

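Save the manifest to a file and apply it with kubectl apply -f. The Helm release does not pick it up automatically, because the chart generates and mounts its own ConfigMap (prometheus-server), so the Prometheus deployment must be updated to mount this one instead. Note that the jobs above omit the scheme: https, tls_config, and bearer token settings that the API server and kubelet endpoints require. Add them, as the chart's default configuration does, before relying on this file.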
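
One way to switch the Helm release from Step 1 over to this ConfigMap is sketched below. It assumes the manifest was saved as prometheus-configmap.yaml (a hypothetical file name). It also relies on the chart's server.configMapOverrideName value, which is understood to make the server mount a ConfigMap named after the release and the override instead of generating its own. Check the values.yaml of your chart version before relying on it.

```shell
# Hypothetical helper: applies the custom ConfigMap and points the release at it.
use_custom_prometheus_config() {
  local manifest="${1:-prometheus-configmap.yaml}"

  # Create or update the ConfigMap named prometheus-server-conf.
  kubectl apply -f "$manifest"

  # Release "prometheus" plus override "server-conf" resolves to prometheus-server-conf.
  # --reuse-values keeps any values set on earlier installs or upgrades.
  helm upgrade prometheus prometheus-community/prometheus \
    --namespace monitoring \
    --reuse-values \
    --set server.configMapOverrideName=server-conf
}
```

Running use_custom_prometheus_config applies the manifest first, so the ConfigMap already exists when the upgraded deployment tries to mount it.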
