User Application or Workload Monitoring on OpenShift Container Platform
Let’s assume a running application has been deployed to the RHOCP cluster inside a project (or namespace) called uat1, and that the Prometheus metrics endpoint is exposed on path /metrics.
In RHOCP 4.6, application monitoring can be set up by enabling monitoring for user-defined projects, without installing an additional monitoring solution. This deploys a second Prometheus Operator instance inside the openshift-user-workload-monitoring namespace, configured to monitor all namespaces except the openshift-prefixed ones already covered by the cluster's default platform monitoring.
Note: To understand the monitoring stack for OpenShift Container Platform, see: https://docs.openshift.com/container-platform/4.6/monitoring/understanding-the-monitoring-stack.html
Let’s start the configuration as follows.
Step 1: Enable monitoring for user-defined projects:
Source Link: https://docs.openshift.com/container-platform/4.6/monitoring/enabling-monitoring-for-user-defined-projects.html
# oc -n openshift-monitoring edit configmap cluster-monitoring-config
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  config.yaml: |
    enableUserWorkload: true
    prometheusK8s:
      retention: 24h
kind: ConfigMap
metadata:
  creationTimestamp: "2021-09-27T12:00:54Z"
  name: cluster-monitoring-config
  namespace: openshift-monitoring
  resourceVersion: "4912259"
  selfLink: /api/v1/namespaces/openshift-monitoring/configmaps/cluster-monitoring-config
  uid: 4590cb83-99e3-404b-92da-ffdeacbccc0d
Note: Check that the prometheus-operator, prometheus-user-workload and thanos-ruler-user-workload pods are running in the openshift-user-workload-monitoring project.
# oc -n openshift-user-workload-monitoring get pod
NAME READY STATUS RESTARTS AGE
prometheus-operator-646cb67c9-qbr8z 2/2 Running 0 3d21h
prometheus-user-workload-0 4/4 Running 1 11h
prometheus-user-workload-1 4/4 Running 1 11h
thanos-ruler-user-workload-0 3/3 Running 0 4d15h
thanos-ruler-user-workload-1 3/3 Running 0 4d11h
Step 2 (Optional): Add the necessary permissions to your user:
Cluster administrators can monitor all core OpenShift Container Platform and user-defined projects, but you can also grant developers and other users permission to monitor their own projects if required.
As an example, to assign the user-workload-monitoring-config-edit role to the user ocp4user1 in the openshift-user-workload-monitoring project:
# oc -n openshift-user-workload-monitoring adm policy add-role-to-user user-workload-monitoring-config-edit ocp4user1 --role-namespace openshift-user-workload-monitoring
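The `oc adm policy` command above creates a RoleBinding behind the scenes. If you prefer a declarative approach, a sketch of the equivalent RoleBinding manifest follows; the `metadata.name` here is illustrative, and the Role name and namespace match the command above:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  # Illustrative name; oc adm policy generates its own
  name: user-workload-monitoring-config-edit
  namespace: openshift-user-workload-monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: user-workload-monitoring-config-edit
subjects:
- kind: User
  name: ocp4user1
```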
Step 3: Set up metrics collection for application projects (other than projects named openshift-*):
We are going to deploy the prometheus-example-app sample application and then verify which metrics it exposes. In this example, a YAML file named sample-http-service.yaml is created containing the namespace, deployment, service, and route configuration.
# cat sample-http-service.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: uat1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: http-sample
  name: http-sample
  namespace: uat1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: http-sample
  template:
    metadata:
      labels:
        app: http-sample
    spec:
      containers:
      - image: ghcr.io/rhobs/prometheus-example-app:0.3.0
        imagePullPolicy: IfNotPresent
        name: http-sample
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: http-sample
  name: http-sample
  namespace: uat1
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
    name: 8080-tcp
  selector:
    app: http-sample
  type: ClusterIP
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  labels:
    app: http-sample
  name: http-sample
  namespace: uat1
spec:
  host: http-sample-uat1.apps.ocp-prod.jazakallah.info
  port:
    targetPort: 8080-tcp
  to:
    kind: Service
    name: http-sample
    weight: 100
  wildcardPolicy: None
# oc apply -f sample-http-service.yaml
# oc get pod -n uat1
NAME READY STATUS RESTARTS AGE
http-sample-6b47b86c6d-wc54g 1/1 Running 0 8m48s
# oc get route -n uat1
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
http-sample http-sample-uat1.apps.ocp-prod.jazakallah.info http-sample 8080-tcp None
# curl http-sample-uat1.apps.ocp-prod.jazakallah.info
Hello from example application.
Now the application exposes metrics through an HTTP service endpoint at the canonical /metrics path. We can list all available metrics for a service by running a curl query against http://<endpoint>/metrics.
For instance, we have exposed a route to the http-sample application, so run the following to view all of its available metrics:
# curl http-sample-uat1.apps.ocp-prod.jazakallah.info/metrics
# HELP http_request_duration_seconds Duration of all HTTP requests
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{code="200",handler="found",method="get",le="0.005"} 6
::::::::::::: CUT SOME OUTPUT :::::::::::::
http_request_duration_seconds_bucket{code="200",handler="found",method="get",le="+Inf"} 6
http_request_duration_seconds_sum{code="200",handler="found",method="get"} 7.1784e-05
http_request_duration_seconds_count{code="200",handler="found",method="get"} 6
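For Prometheus to actually scrape these metrics, a ServiceMonitor resource is also needed in the uat1 project, as described in the OpenShift documentation linked above. A minimal sketch follows; the resource name http-sample-monitor and the 30s interval are illustrative, while the port name 8080-tcp and the app: http-sample selector match the Service defined earlier:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: http-sample-monitor  # illustrative name
  namespace: uat1
spec:
  endpoints:
  - interval: 30s            # illustrative scrape interval
    port: 8080-tcp           # must match the Service port name
    scheme: http
  selector:
    matchLabels:
      app: http-sample
```

Apply it with oc apply -f, after which the http-sample metrics should appear in the Observe > Metrics view of the web console.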
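As a side note on reading this histogram output: the `_sum` series holds the total of all observed durations and `_count` the number of observations, so their ratio gives the average request duration. The sketch below (not part of the original walkthrough) parses the two sample lines from the curl output above:

```python
# Parse a few lines of Prometheus text-exposition output (as returned by
# /metrics) and derive the average request duration from the histogram's
# _sum and _count series. The sample values are taken from the curl
# output shown above.
sample = """\
http_request_duration_seconds_sum{code="200",handler="found",method="get"} 7.1784e-05
http_request_duration_seconds_count{code="200",handler="found",method="get"} 6
"""

metrics = {}
for line in sample.splitlines():
    if not line or line.startswith("#"):
        continue  # skip HELP/TYPE comment lines
    name_and_labels, value = line.rsplit(" ", 1)
    metric_name = name_and_labels.split("{", 1)[0]
    metrics[metric_name] = float(value)

avg = (metrics["http_request_duration_seconds_sum"]
       / metrics["http_request_duration_seconds_count"])
print(f"average request duration: {avg:.3e} s")  # prints 1.196e-05 s
```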