A collection of characters, stories, and other elements
This is the first part in a four-part series covering practical steps to optimize cloud costs across your organization.
There is a growing trend for organizations of all sizes to get out of the business of running their own data centers and move their infrastructure and application workloads to the cloud. This typically includes a transition to software as a service (SaaS) as well as leveraging infrastructure as a service (IaaS). IaaS providers are often referenced as “Hyperscalers” and include Google Cloud Platform (GCP), Amazon AWS, and Microsoft Azure. The advantages of and motivations for adopting cloud applications and infrastructure are often to achieve cost savings through a shift from a CapEx to an OpEx financial model and the opportunity to only pay for the resources and infrastructure you need or use.
However, most organizations are finding that there is actually a higher cost than expected for running their legacy workloads in a hyperscale environment. In fact, Gartner estimates that as much as 70% of cloud costs are wasted. As a result, there has been a need to adopt best practices for optimizing cloud costs (examples: What is cloud cost optimization? 15+ best practices for 2022, Cloud Cost optimization: definition and strategies, 9 Best practices for cloud cost optimization). While these articles are helpful if you need to drive alignment around an initiative, what they miss are practical steps you can take today to start optimizing your cloud costs.
Whether you are starting your journey to the cloud, or just looking to better optimize your current cloud costs, it helps to break the problem down into two different buckets.
- Infrastructure configuration, optimization, and automation – Create awareness and visibility into cloud infrastructure usage using infrastructure as code (covered here in part 1). Then review how your infrastructure can be automated and optimized to find cost savings (covered in part 2 – coming soon).
- Application refactoring – Review application architecture and resource utilization. Often applications are monolithic and not designed to be cloud native. Hyperscalers provide compute, memory, and storage, as well as new standardized services, technologies, and approaches that were not available in the traditional data center environments. This requires insights into the portions of your code that can leverage the newly available technologies and the ability to rapidly update patterns in code across the organization. At the very least, organizations need to identify and optimize their workloads to run more efficiently in the new hyperscale environment (covered in parts 3 and 4 – coming soon).
Infrastructure as code
Infrastructure as code (IaC) is the process of managing and provisioning cloud infrastructure through machine-readable definition files like Terraform (open source repositories with examples, example files) and AWS CloudFormation (open source repositories with examples, example files), rather than physical hardware configuration or interactive configuration tools . These definition files make it possible to have visibility of all your provisioned infrastructure, especially if you have a universal code search solution inside your organization.
According to Gartner research, less than 5% of server provisioning utilized IaC in 2020, and only 40% is expected to do so by 2023. This means that the vast majority of cloud infrastructure is manually provisioned, built on a huge amount of untraceable scripts, or manually configured in the cloud provider interface.
If your organization has not adopted IaC (which is the majority – you’re not alone!), then the first step is to create visibility into your cloud infrastructure configuration by exporting it into a version control system so that you can search over the current state and see the history of changes.
Doing a one-time audit of your cloud infrastructure is beneficial for cutting costs, but if you want to continually review for cost reduction, you’ll want to have automation in place to regularly export your config into a Git repository. We’ve pulled together some helpful resources that should make this setup a bit easier.
Export all cluster config into a Git repository – use a tool like CloudFormation or Terraform (provisioning tools).
First, choose the provisioning tool you want to export to. We recommend Terraform. Here is a helpful blog series on choosing Terraform over other provisioning tools.
Terraformer is a CLI to export config to Terraform on any cloud provider.
Export all services config into a Git repository – use a tool like Chef, Puppet, Ansible, or Kubernetes (configuration management tools) to export the services configuration.
We deploy Sourcegraph.com with Kubernetes, and use kube-backup to export and backup this configuration.
Set up a CronJob to regularly snapshot the configuration files from the first two steps into the Git repos. By regularly updating, you will be able to have better traceability of changes over time, and it will allow you to better understand the entire system. Additionally, you have now moved one small step closer to infrastructure as code.
Visibility and awareness are the first steps toward greater cost savings across your organization and extend to include both direct costs of cloud infrastructure as well as operational savings. The next step is to identify targets for optimization and to monitor changes over time. We will cover approaches and practical examples of this in part 2.
Thanks to the following people for helping with this post: Mark McCauley, Rafal Gajdulewicz, and Nick Snyder.
About the author
Christina Forney is product advisor at Sourcegraph, and has 10+ years working in developer and product roles in the developer tools space. You can reach Christina by email or on Twitter @christina4nee.