Sometimes we wonder why we pay so much to our cloud provider(s). When building infrastructure, TCO (Total Cost of Ownership) is one of the critical factors. Keeping TCO under control takes a lot of work, and there is plenty of discussion online about cost optimisation.
In this article, we discuss one item that can help you keep the number of worker nodes reasonable: tuning your applications' resource requests. In Kubernetes, we use requests and limits to manage resources effectively. When you specify a resource request for the containers in a Pod, the kube-scheduler uses this information to decide which node to place the Pod on.
Requests are the more critical of the two for cost optimisation. If requests are wrongly configured (for example, set too high), the cluster may provision extra nodes just to fit Pods with large requests, and you end up wasting compute resources allocated to your Kubernetes cluster. The money you pay the cloud provider for that unused capacity is a loss.
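To make the effect of oversized requests concrete, here is a minimal Python sketch that estimates how many nodes a set of Pods needs based only on their CPU requests. It uses a simple first-fit-decreasing packing, which is an assumption for illustration and not how the real kube-scheduler works (it also considers memory, affinity, taints and more). The function name `nodes_needed` and the node size of 3900 millicores are hypothetical.

```python
def nodes_needed(pod_cpu_requests_m, node_allocatable_m):
    """Estimate node count from CPU requests (millicores) using first-fit-decreasing.

    Illustrative only: the real scheduler weighs many more factors.
    """
    free = []  # remaining allocatable CPU on each node opened so far
    for req in sorted(pod_cpu_requests_m, reverse=True):
        if req > node_allocatable_m:
            raise ValueError(f"request {req}m does not fit on any node")
        for i, remaining in enumerate(free):
            if req <= remaining:
                free[i] -= req  # place the Pod on the first node with room
                break
        else:
            # No existing node has room, so a new node is needed.
            free.append(node_allocatable_m - req)
    return len(free)


NODE_ALLOCATABLE_M = 3900  # hypothetical allocatable CPU per node, in millicores

oversized = [1000] * 20    # 20 Pods, each requesting 1000m
right_sized = [300] * 20   # the same 20 Pods, each requesting 300m

print(nodes_needed(oversized, NODE_ALLOCATABLE_M))    # 7
print(nodes_needed(right_sized, NODE_ALLOCATABLE_M))  # 2
```

In this made-up scenario the same 20 Pods need 7 nodes with the inflated requests and 2 nodes once the requests are reduced, regardless of how much CPU the Pods actually use.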
Many articles on cost optimisation do not mention the resource part. You should review the resources allocated, requested and used for the Namespaces in your Kubernetes cluster. Doing this manually is painful. Instead, we can fetch all this information from metrics and build dashboards to show it.
So we are covering the following:
- Grafana dashboard to see overall CPU / Memory / Ephemeral disk usage of your K8s cluster.
- Grafana dashboard / table to list the current CPU usage, CPU request, current CPU request usage percentage, average CPU request usage percentage, resource limits, etc.
- Grafana dashboard to see overall CPU / Memory / Pod usage of your Namespace.
- Grafana dashboard to analyse the resource usage of Pods in a K8s Namespace.
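As a rough illustration of the "CPU request usage percentage" column, the Python sketch below assumes it is defined as current CPU usage divided by the CPU request, times 100. That definition, the function name `request_usage_pct` and the sample workloads are assumptions for illustration, not taken from the dashboards themselves.

```python
def request_usage_pct(usage_cores, request_cores):
    """Return CPU usage as a percentage of the CPU request.

    Returns None when no request is set, since the ratio is undefined.
    """
    if request_cores <= 0:
        return None
    return 100.0 * usage_cores / request_cores


# Hypothetical workloads: (name, current CPU usage in cores, CPU request in cores)
rows = [
    ("api", 0.25, 1.0),     # heavily over-requested
    ("worker", 0.45, 0.5),  # request close to real usage
    ("cron", 0.02, 0.0),    # no request configured
]

for name, usage, request in rows:
    pct = request_usage_pct(usage, request)
    shown = "n/a" if pct is None else f"{pct:.0f}%"
    print(f"{name:8s} usage={usage:.2f} request={request:.2f} request-usage={shown}")
```

A workload that consistently sits at a low percentage is a candidate for a smaller request, while one close to or above 100% may need a larger one.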
There are many dashboards available for this. As part of some recent analysis, I created a set of dashboards that summarise resources for your Kubernetes cluster and its Namespaces. There are three main dashboards: L1, L2 and L3. The L1 dashboard gives you an overview, and from there you can drill down to L2 and L3.
You can find the dashboards' code on the GitHub page. Try implementing them and share your feedback.
Feel free to contribute to this project: