
Strategies for Cost Optimization Across Amazon EKS Clusters

Saurabh Taneja

Cloud & DevOps

Fast-growing tech companies rely heavily on Amazon EKS clusters to host a variety of microservices and applications. The pairing of Amazon EKS for managing the Kubernetes Control Plane and Amazon EC2 for flexible Kubernetes nodes creates an optimal environment for running containerized workloads. 

With the increasing scale of operations, optimizing costs across multiple EKS clusters has become a critical priority. This blog will demonstrate how we can leverage various tools and strategies to analyze, optimize, and manage EKS costs effectively while maintaining performance and reliability. 

Cost Analysis:

Cost analysis is the essential first step toward cost optimization. Data plays an important role here, so trust your data. The total cost of operating an EKS cluster encompasses several components. The EKS Control Plane incurs a fixed cost of $0.10 per cluster per hour, offering straightforward pricing.

Meanwhile, EC2 instances, serving as the cluster's nodes, introduce various cost factors, such as block storage and data transfer, which can vary significantly based on workload characteristics. For this discussion, we'll focus primarily on two aspects of EC2 cost: instance hours and instance pricing. Let’s look at how to do the cost analysis on your EKS cluster.

  • Tool Selection: We can begin our cost analysis journey by selecting Kubecost, a powerful tool specifically designed for Kubernetes cost analysis. Kubecost provides granular insights into resource utilization and costs across our EKS clusters.
  • Deployment and Usage: Deploying Kubecost is straightforward: you can integrate it with your Kubernetes clusters by following the official documentation (a minimal install sketch follows this list). Kubecost's intuitive dashboard lets you visualize resource usage, cost breakdowns, and cost allocation by namespace, pod, or label. Once deployed, you can open the Kubecost overview page in your browser by port-forwarding the Kubecost Kubernetes service. It might take 5-10 minutes for Kubecost to gather metrics. You can then see your Amazon EKS spend, including cumulative cluster costs, associated Kubernetes asset costs, and monthly aggregated spend.
  • Cluster Level Cost Analysis: For multi-cluster cost analysis and cluster-level scoping, define an AWS tagging strategy and tag your EKS clusters (see the tagging example after this list); the AWS tagging documentation covers this in detail. You can then view your cost analysis in AWS Cost Explorer, which provides additional insights into your AWS usage and spending trends. By analyzing cost and usage data at a granular level, you can identify areas for further optimization and cost reduction.
  • Multi-Cluster Cost Analysis using Kubecost and Prometheus: The Kubecost deployment ships with a bundled Prometheus server that stores its cost metrics. For multiple EKS clusters, you can instead send metrics to a remote Prometheus server, either Amazon Managed Service for Prometheus or a self-managed one. To collect cost metrics from multiple clusters, run Kubecost with an additional SigV4 proxy container that sends individual and combined cluster metrics to the common Prometheus server. You can follow the AWS documentation for multi-cluster cost analysis using Kubecost and Prometheus.
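
As a minimal install sketch (the chart, namespace, and service names follow the Kubecost documentation at the time of writing; verify them against your Kubecost version):

    # Add the Kubecost Helm repository and install the cost-analyzer chart
    helm repo add kubecost https://kubecost.github.io/cost-analyzer/
    helm upgrade --install kubecost kubecost/cost-analyzer \
      --namespace kubecost --create-namespace

    # Expose the dashboard locally, then open http://localhost:9090
    kubectl port-forward --namespace kubecost svc/kubecost-cost-analyzer 9090:9090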
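
For cluster-level scoping in Cost Explorer, a hypothetical AWS CLI tagging example (the cluster ARN and tag keys are placeholders for your own convention):

    # Tag an EKS cluster so its costs can be grouped in AWS Cost Explorer
    aws eks tag-resource \
      --resource-arn arn:aws:eks:us-east-1:111122223333:cluster/my-cluster \
      --tags team=platform,environment=production

Note that tags must be activated as cost allocation tags in the AWS Billing console before they appear in Cost Explorer.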

Cost Optimization Strategies:

Based on the cost analysis, the next step is to plan your cost optimization strategies. As explained in the previous section, the Control Plane has a fixed cost and a straightforward pricing model, so we will focus mainly on optimizing the data plane nodes and the application configuration. Let’s look at the following strategies when optimizing the cost of the EKS cluster and supporting AWS services:

  • Right Sizing: The Cost Optimization pillar of the AWS Well-Architected Framework includes a section on Cost-Effective Resources, which describes right sizing as:

“… using the lowest cost resource that still meets the technical specifications of a specific workload.”

  • Application Right Sizing: Right sizing is the strategy of allocating the appropriate CPU and memory resources to pods. Take care to set requests that align as closely as possible to the actual utilization of these resources. If the value is too low, the containers may be throttled, impacting performance; if the value is too high, there is waste, since the unused resources remain reserved for that single container. When actual utilization is lower than the requested value, the difference is called slack cost. A tool like kube-resource-report is valuable for visualizing slack cost and right-sizing the requests for the containers in a pod (a minimal requests/limits sketch follows this section). Its installation instructions show how to install it via the included Helm chart:

    helm upgrade --install kube-resource-report chart/kube-resource-report

    You can also consider tools like the VPA recommender with Goldilocks to gain insight into your pods' resource consumption and utilization.
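
As a minimal sketch of a right-sized container spec (the deployment name and values are hypothetical; derive real numbers from Kubecost or kube-resource-report data):

    # Hypothetical right-sized container: requests reflect observed
    # steady-state usage; limits cap bursts without over-reserving capacity.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api-server
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: api-server
      template:
        metadata:
          labels:
            app: api-server
        spec:
          containers:
            - name: api-server
              image: example.com/api-server:1.0
              resources:
                requests:
                  cpu: 250m
                  memory: 256Mi
                limits:
                  cpu: 500m
                  memory: 512Mi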

  • Compute Right Sizing: Application right sizing and Kubecost analysis are prerequisites for right-sizing EKS compute. Here are several strategies for compute right sizing (sketches for the first two follow this list):
    • Mixed Instance Auto Scaling group: Employ a mixed instance policy to create a diversified pool of instances within your auto scaling group. This mix can include both Spot and On-Demand instances. However, it's advisable not to mix instances of different sizes within the same Node group.
    • Node Groups, Taints, and Tolerations: Utilize separate Node Groups with varying instance sizes for different application requirements. For example, use distinct node groups for GPU-intensive and CPU-intensive applications. Use taints and tolerations to ensure applications are deployed on the appropriate node group.
    • Graviton Instances: Explore the adoption of Graviton instances, which offer up to 40% better price performance compared to equivalent x86-based instances. Consider migrating to Graviton to optimize costs and enhance application performance.
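
A minimal eksctl sketch of a mixed-instance node group (field names follow the eksctl schema at the time of writing; the instance types and percentages are hypothetical):

    # Hypothetical node group mixing On-Demand and Spot capacity across
    # similarly sized instance types (same vCPU/memory, per the advice
    # above not to mix sizes within one group).
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: my-cluster
      region: us-east-1
    nodeGroups:
      - name: mixed-workers
        minSize: 2
        maxSize: 10
        instancesDistribution:
          instanceTypes: ["m5.xlarge", "m5a.xlarge", "m5n.xlarge"]
          onDemandBaseCapacity: 1
          onDemandPercentageAboveBaseCapacity: 25
          spotAllocationStrategy: capacity-optimized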
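
And a minimal taints-and-tolerations sketch (the taint key and values are hypothetical): taint the GPU node group so only pods that tolerate the taint are scheduled there, and pin GPU workloads to it with a node selector:

    # Taint the GPU nodes (in practice, set this in the node group config):
    #   kubectl taint nodes <gpu-node-name> workload=gpu:NoSchedule
    # Pod spec fragment for a GPU-intensive application:
    spec:
      tolerations:
        - key: workload
          operator: Equal
          value: gpu
          effect: NoSchedule
      nodeSelector:
        workload: gpu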

“Spot Instances allow you to use spare compute capacity at a significantly lower cost than On-Demand EC2 instances (up to 90%).”

Understanding purchase options for Amazon EC2 is crucial for cost optimization. The Amazon EKS data plane consists of worker nodes or serverless compute resources responsible for running Kubernetes application workloads. These nodes can utilize different capacity types and purchase options, including On-Demand Instances, Spot Instances, Savings Plans, and Reserved Instances.

On-Demand and Spot capacity offer flexibility without spending commitments. On-Demand instances are billed based on runtime and guarantee availability at On-Demand rates, while Spot instances offer discounted rates but are preemptible. Both options are suitable for temporary or bursty workloads, with Spot instances being particularly cost-effective for applications tolerant of compute availability fluctuations. 

Reserved Instances involve spending commitments over one- or three-year terms in exchange for discounted rates. Once a steady-state resource consumption profile is established, Reserved Instances or Savings Plans become effective. Savings Plans, introduced as a more flexible alternative to Reserved Instances, allow for commitments based on a dollar-per-hour spend amount, irrespective of provisioned resources. There are two types: Compute Savings Plans, offering flexibility across instance types, Fargate, and Lambda charges, and EC2 Instance Savings Plans, providing deeper discounts but restricting compute choice to an instance family.

Tailoring your approach to your workload can significantly impact cost optimization within your EKS cluster. For non-production environments, leveraging Spot Instances exclusively can yield substantial savings. Meanwhile, implementing Mixed-Instances Auto Scaling Groups for production workloads allows for dynamic scaling and cost optimization. Additionally, for steady workloads, investing in a Savings Plan for EC2 instances can provide long-term cost benefits. By strategically planning and optimizing your EC2 instances, you can achieve a notable reduction in your overall EKS compute costs, potentially reaching savings of approximately 60-70%.

“… this (matching supply and demand) is accomplished using Auto Scaling, which helps you to scale your EC2 instances and Spot Fleet capacity up or down automatically according to conditions you define.”

  • Cluster Autoscaling: A prerequisite to cost optimization on a Kubernetes cluster, then, is to ensure you have Cluster Autoscaler running. This tool performs two critical functions in the cluster. First, it monitors the cluster for pods that are unable to run due to insufficient resources; whenever this occurs, Cluster Autoscaler updates the Amazon EC2 Auto Scaling group to increase the desired count, resulting in additional nodes in the cluster. Second, it detects nodes that are underutilized and reschedules their pods onto other nodes, then decreases the desired count for the Auto Scaling group to scale in the number of nodes.

The Amazon EKS User Guide has a great section on the configuration of the Cluster Autoscaler. There are a couple of things to pay attention to when configuring the Cluster Autoscaler:

IAM Roles for Service Accounts (IRSA) – Cluster Autoscaler requires access to update the desired capacity in the Auto Scaling group. The recommended approach is to create a new IAM role with the required policies and a trust policy that restricts access to the service account used by Cluster Autoscaler. The role ARN must then be provided as an annotation on the service account:

CODE: https://gist.github.com/velotiotech/96cf3871358781aab8f3d19befc24876.js
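
As a minimal sketch of that annotation (the account ID, role name, and namespace are hypothetical):

    # ServiceAccount used by Cluster Autoscaler, annotated with the IAM role
    # ARN so IRSA can exchange the pod's token for role credentials.
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: cluster-autoscaler
      namespace: kube-system
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/ClusterAutoscalerRole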

Auto-Discovery Setup

Set up your Cluster Autoscaler in auto-discovery mode by enabling the --node-group-auto-discovery flag as an argument. Also, make sure to tag your EKS nodes’ Auto Scaling groups with the following tags:

k8s.io/cluster-autoscaler/enabled,
k8s.io/cluster-autoscaler/<cluster-name>
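
A minimal sketch of the relevant container arguments (the flag format follows the Cluster Autoscaler documentation; <cluster-name> is your cluster's name):

    # Fragment of the cluster-autoscaler container spec
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<cluster-name>
      - --balance-similar-node-groups=true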


Auto Scaling Group per AZ – When Cluster Autoscaler scales out, it simply increases the desired count for the Auto Scaling group, leaving the responsibility for launching new EC2 instances to the AWS Auto Scaling service. If an Auto Scaling group is configured for multiple availability zones, then the new instance may be provisioned in any of those availability zones.

For deployments that use persistent volumes, you will need to provision a separate Auto Scaling group for each availability zone. This way, when Cluster Autoscaler detects the need to scale out in response to a given pod, it can target the correct availability zone for the scale-out based on persistent volume claims that already exist in a given availability zone.

When using multiple Auto Scaling groups, be sure to include the following argument in the pod specification for Cluster Autoscaler:

--balance-similar-node-groups=true

  • Pod Autoscaling: Now that Cluster Autoscaler is running in the cluster, you can be confident that instance hours will align closely with the demand from pods within the cluster. The next step is to use the Horizontal Pod Autoscaler (HPA) to scale the number of pods in a deployment out or in based on pod metrics, optimizing pod hours and, in turn, instance hours.

The HPA controller is included with Kubernetes, so all that is required to configure HPA is to ensure the Kubernetes Metrics Server is deployed in your cluster and then define HPA resources for your deployments. For example, the following HPA resource is configured to monitor the CPU utilization of a deployment named nginx-ingress-controller. HPA will then scale the number of pods out or in, between 1 and 5, to target an average CPU utilization of 80% across all the pods:

CODE: https://gist.github.com/velotiotech/c1efc311e8073d2fabea926f596c8b15.js
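
The linked gist defines this resource; a minimal equivalent sketch (using the autoscaling/v2 API; adjust to your cluster version):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-ingress-controller
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-ingress-controller
      minReplicas: 1
      maxReplicas: 5
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 80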

The combination of Cluster Autoscaler and Horizontal Pod Autoscaler is an effective way to keep EC2 instance hours tied as close as possible to the actual utilization of the workloads running in the cluster.

“Systems can be scheduled to scale out or in at defined times, such as the start of business hours, thus ensuring that resources are available when users arrive.”

There are many deployments that only need to be available during business hours. A tool named kube-downscaler can be deployed to the cluster to scale deployments in and out based on the time of day.

Some example use cases of kube-downscaler are (an annotation sketch follows these examples):

  • Deploy the downscaler to a test (non-prod) cluster with a default uptime or downtime time range to scale down all deployments during the night and weekend.
  • Deploy the downscaler to a production cluster without any default uptime/downtime setting and scale down specific deployments by setting the downscaler/uptime (or downscaler/downtime) annotation. This might be useful for internal tooling front ends that are only needed during working hours.
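
As a minimal sketch of the second use case (the deployment name and schedule are hypothetical; the annotation format follows the kube-downscaler documentation):

    # Keep this deployment up only during business hours; kube-downscaler
    # scales it down (to zero replicas by default) outside this window.
    kubectl annotate deployment internal-dashboard \
      'downscaler/uptime=Mon-Fri 08:00-19:00 Europe/Berlin'
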
  • AWS Fargate with EKS: With AWS Fargate, a serverless compute engine, you can run Kubernetes pods without provisioning or managing worker nodes.

AWS Fargate pricing is based on usage (pay-per-use), with no upfront charges. There is, however, a one-minute minimum charge, and charges are rounded up to the nearest second. You will also be charged for any additional services you use, such as CloudWatch and data transfer. Fargate can also reduce your management costs by reducing the number of DevOps professionals and tools you need to run Kubernetes on Amazon EKS.
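
A minimal sketch of moving a namespace onto Fargate (the cluster and namespace names are hypothetical; the command follows eksctl's documented syntax):

    # Create a Fargate profile so pods in the 'batch-jobs' namespace are
    # scheduled onto Fargate instead of EC2 worker nodes.
    eksctl create fargateprofile \
      --cluster my-cluster \
      --name fp-batch \
      --namespace batch-jobs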

Conclusion:

Effectively managing costs across multiple Amazon EKS clusters is essential for optimizing operations. By utilizing tools like Kubecost and AWS Cost Explorer, coupled with strategies such as right sizing, mixed instance policies, and Spot Instances, organizations can streamline cost analysis and optimize resource allocation. Additionally, implementing auto-scaling mechanisms like Cluster Autoscaler ensures dynamic resource scaling based on demand, further optimizing costs. Leveraging AWS Fargate with EKS can eliminate the need to manage worker nodes, reducing management costs. Overall, by combining these strategies, organizations can achieve significant cost savings while maintaining performance and reliability in their containerized environments.


Did you like the blog? If yes, we're sure you'll also like to work with the people who write them - our best-in-class engineering team.

We're looking for talented developers who are passionate about new emerging technologies. If that's you, get in touch with us.

Explore current openings
