AWS is the most widely used cloud provider, and every company I have consulted with wants to reduce its AWS cost as much as possible. In this blog, I will show some of the optimizations I did and how they impacted the cost.
To give some context, my client runs Big Data workloads alongside a regular web app.
Before going into the details of the optimisation, there are 2 key points to mention:
Get familiar with AWS Cost Explorer. Also, make sure you tag each and every AWS resource; this helped a lot in identifying the most expensive resources.
Next, set up a Slack integration so that you see the AWS cost and usage every day.
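A minimal sketch of what the daily Slack report could look like, assuming the cost data has already been fetched from Cost Explorer grouped by a tag. The `team` tag key, the function names, and the sample numbers are hypothetical and not taken from the client setup. The code only turns a result shaped like a Cost Explorer `GetCostAndUsage` response into the text of a Slack message; fetching the data and posting it to a webhook are left out.

```python
from decimal import Decimal


def summarize_daily_cost(response: dict) -> list[tuple[str, Decimal]]:
    """Return (group, cost) pairs for the first day, most expensive first."""
    day = response["ResultsByTime"][0]
    rows = []
    for group in day["Groups"]:
        key = group["Keys"][0]
        # Tag groupings are assumed to come back as "tagkey$value";
        # an empty value means the resource is untagged.
        name = key.split("$", 1)[1] if "$" in key else key
        name = name or "(untagged)"
        amount = Decimal(group["Metrics"]["UnblendedCost"]["Amount"])
        rows.append((name, amount))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def format_slack_message(date: str, rows: list[tuple[str, Decimal]]) -> dict:
    """Build a payload with a single 'text' field, as Slack webhooks accept."""
    total = sum((cost for _, cost in rows), Decimal("0"))
    lines = [f"AWS cost for {date}: ${total:,.2f}"]
    lines += [f"- {name}: ${cost:,.2f}" for name, cost in rows]
    return {"text": "\n".join(lines)}


# Hypothetical sample data, shaped like a GetCostAndUsage result.
sample = {
    "ResultsByTime": [{
        "TimePeriod": {"Start": "2024-01-01", "End": "2024-01-02"},
        "Groups": [
            {"Keys": ["team$web"],
             "Metrics": {"UnblendedCost": {"Amount": "310.25", "Unit": "USD"}}},
            {"Keys": ["team$bigdata"],
             "Metrics": {"UnblendedCost": {"Amount": "1520.50", "Unit": "USD"}}},
            {"Keys": ["team$"],
             "Metrics": {"UnblendedCost": {"Amount": "95.00", "Unit": "USD"}}},
        ],
    }]
}

rows = summarize_daily_cost(sample)
message = format_slack_message("2024-01-01", rows)
print(message["text"])
```

Listing the untagged bucket explicitly is a deliberate choice: a large "(untagged)" line in the daily message is a quick signal that the tagging discipline is slipping.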
Basic Sanity checking for the resource usage
Though I mentioned the initial cost as $45K/m, at peak time it was around $85K/m, and most of that was instances ($63K/m), many of them running unnecessarily.
People are hesitant to stop these instances because they are not aware of where they are being used and what consequences stopping them will have.
Understanding the System Architecture & being confident about resources that are in use helps a lot in saving the cost. It's that simple but very effective.
A few more basic optimizations are:
Check CloudWatch monitoring for the usage pattern of the last 7 / 30 days and either reduce the size or stop the instance.
Remove all unnecessary EBS volumes that are left hanging around. Take a snapshot first if you want to keep the data.
Upgrade to the latest generation in the instance family.
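The CloudWatch check above can be sketched as a simple rule over CPU utilization samples. This is only an illustration: the function name and both thresholds (5% and 30%) are assumptions, not values from the client setup, and a real decision should also look at memory, network, and disk.

```python
from statistics import mean


def rightsizing_hint(cpu_percent: list[float],
                     idle_below: float = 5.0,
                     downsize_below: float = 30.0) -> str:
    """Classify an instance from CPU utilization samples (in percent).

    The samples are expected to cover the review window (7 or 30 days).
    """
    if not cpu_percent:
        return "no-data"
    avg, peak = mean(cpu_percent), max(cpu_percent)
    if peak < idle_below:
        return "stop"       # never busy during the whole window
    if avg < downsize_below and peak < 2 * downsize_below:
        return "downsize"   # mostly idle, with modest peaks
    return "keep"


# Hypothetical samples for three instances.
print(rightsizing_hint([1.2, 0.8, 2.0]))      # stop
print(rightsizing_hint([12, 18, 25, 40]))     # downsize
print(rightsizing_hint([55, 70, 90]))         # keep
```

Using the peak as well as the average avoids downsizing an instance that is idle most of the day but genuinely busy during a nightly job.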
Managed Services to Self Hosted Services
Just for our Dev & QA environments, we are spending this much on AWS Managed Services.
We cut down this cost by running these workloads ourselves inside Kubernetes.
MSK is replaced with Strimzi
OpenSearch is replaced with Elasticsearch with a Persistent Volume
RDS is replaced with MySQL with a Persistent Volume
But this does not come for free. We now manage all the backups, maintenance, and persistent storage for this data ourselves. Using what we learn here, we plan to run the Production load ourselves in the future.
To start with, AWS Managed Services are good. But once your infrastructure has stabilized, running it on your own is a cost-effective approach.
Spot Instance and Savings Plan
Try to tackle the resources that cost the most first. In our case, that was EC2 instances. We were running all the workloads on On-Demand instances, which made this low-hanging fruit.
After stopping the unnecessary instances, the EC2 cost went down from $63K/m to $30K/m, and with a Savings Plan and Spot Instances it went down further to $9K/m.
A Savings Plan is a commitment you give to AWS, for a term of 1 or 3 years, on your EC2 instance usage. In return it can cut the cost by around 60%, depending on the term and payment option.
Spot Instances can be leveraged where the workload is not critical and is resilient to interruption (Big Data workloads like Spark executors and Flink Task Managers). This can easily cut the cost by 60–70%.
Use the Spot Instance Advisor to learn more about interruption frequency and savings for each instance type.
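To make the arithmetic concrete, here is a small sketch of how a mix of purchase options translates into a monthly bill. The $30K/m and $9K/m figures are from this post; the usage mix and the per-option discounts in the second example are hypothetical and only show how the calculation works.

```python
def overall_discount(before: float, after: float) -> float:
    """Fractional reduction between two monthly costs."""
    return (before - after) / before


def blended_monthly_cost(on_demand_cost: float,
                         mix: dict[str, float],
                         discount: dict[str, float]) -> float:
    """Monthly cost after spreading usage over purchase options.

    on_demand_cost: what the whole workload would cost On-Demand.
    mix: share of usage per option (shares must sum to 1).
    discount: fractional discount per option relative to On-Demand.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1")
    return sum(on_demand_cost * share * (1.0 - discount[option])
               for option, share in mix.items())


# Figures from this post: $30K/m down to $9K/m.
print(f"{overall_discount(30_000, 9_000):.0%}")  # 70%

# Hypothetical mix: 10% On-Demand, 40% Savings Plan, 50% Spot.
mix = {"on_demand": 0.1, "savings_plan": 0.4, "spot": 0.5}
discount = {"on_demand": 0.0, "savings_plan": 0.6, "spot": 0.7}
print(round(blended_monthly_cost(30_000, mix, discount)))  # 12300
```

The takeaway from the sketch is that whatever stays On-Demand dominates the bill quickly, so the overall saving depends on how much of the usage can be moved to a Savings Plan or Spot.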
Ability to Shut Down & Start Environments Easily
We created an Admin app where the Dev & QA environment can be shut down and started with just a click.
In the future, we are planning to have scheduled downtime for these environments.
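A scheduled downtime could be driven by a rule as simple as the following sketch. The working hours (08:00 to 20:00, Monday to Friday) and the function names are assumptions for illustration; the datetime is treated as local time with no time zone handling.

```python
from datetime import datetime, time


def should_be_running(now: datetime,
                      start: time = time(8, 0),
                      stop: time = time(20, 0)) -> bool:
    """True on weekdays between start (inclusive) and stop (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start <= now.time() < stop


def weekly_fraction_off(start: time = time(8, 0),
                        stop: time = time(20, 0)) -> float:
    """Fraction of the 168 hours in a week during which the environment is off."""
    up_hours_per_day = (stop.hour + stop.minute / 60) - (start.hour + start.minute / 60)
    return 1 - (5 * up_hours_per_day) / 168


print(should_be_running(datetime(2024, 1, 3, 10, 0)))  # Wednesday morning
print(should_be_running(datetime(2024, 1, 6, 10, 0)))  # Saturday morning
print(f"{weekly_fraction_off():.0%} of the week off")
```

With these assumed hours the environment is up for 60 of the 168 hours in a week, which is why scheduled downtime is attractive for Dev & QA environments in particular.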
Data Transfer Cost
There are many ways in which Data Transfer costs can shoot up. In our case, we reduced these costs by redesigning the systems to communicate within a single AZ.
E.g., keeping Spark, Flink & Scylla in the same AZ avoids data transfer costs between them.
2 key points that helped us:
Try to restrict communication to within a single AZ. Inter-AZ communication incurs a lot of Data Transfer cost, and so does Inter-Region communication.
Access S3 / ECR only through private (VPC) endpoints. Pulling data out of S3 through the public endpoint will incur additional costs.
We cut down 80% of the cost for Dev & QA by restricting communication within a Single AZ and for Production we are finding ways to optimize it.
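A rough way to estimate what cross-AZ traffic costs is sketched below. The rate of $0.01 per GB in each direction is an assumption based on commonly listed inter-AZ pricing and should be checked against current AWS pricing for your region; the traffic volumes and AZ names are hypothetical.

```python
def cross_az_cost(flows: list[tuple[str, str, float]],
                  placement: dict[str, str],
                  rate_per_gb_each_way: float = 0.01) -> float:
    """Estimate the monthly inter-AZ data transfer cost in USD.

    flows: (source service, destination service, GB per month).
    placement: service name -> availability zone.
    """
    total = 0.0
    for src, dst, gb in flows:
        if placement[src] != placement[dst]:
            # Assumed to be billed on both the sending and the receiving side.
            total += gb * rate_per_gb_each_way * 2
    return total


# Hypothetical monthly traffic between the services mentioned above.
flows = [
    ("spark", "scylla", 50_000),
    ("flink", "scylla", 20_000),
    ("spark", "flink", 5_000),
]

spread = {"spark": "az-a", "flink": "az-b", "scylla": "az-c"}
single = {"spark": "az-a", "flink": "az-a", "scylla": "az-a"}

print(round(cross_az_cost(flows, spread), 2))
print(round(cross_az_cost(flows, single), 2))
```

The trade-off is availability: a single-AZ layout removes this charge but also removes protection against an AZ outage, which is easier to accept for Dev & QA than for Production.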
I hope you find these optimization techniques useful. If you are in the process of optimizing your own AWS costs, let me know your techniques in the comments.
Currently, I am exploring AWS Cost Optimization Hub and will write about how beneficial it is in another post.