Two fundamental cloud practices every technical leader should implement

Author: Matan Bordo

Technical Account Managers (TAMs) at DoiT, mirroring their hyperscaler counterparts, act as advisors, guiding customers through both technical and strategic aspects of their cloud journey.

They help customers with everything from cost optimization to planning their next cloud commitment, including:

  • Cost optimization
  • Re-architecting for high availability and performance
  • Guiding them through their next cloud commitment

Moreover, TAMs do this across a diverse range of customers in different industries, giving them a front-row seat to new best practices, edge cases, and more.

Undoubtedly, this gives TAMs a wealth of interesting experiences that can be distilled into useful lessons any technical leader can implement when navigating their own cloud journey.

That’s why we invited a few of our TAMs onto our Cloud Masters podcast to share some interesting stories.

In this blog, we’ll use stories shared by two of the TAMs on that podcast episode, Ieva Jonaityte and Eric Ethridge, to illustrate why it’s crucial to:

  1. Regularly review architecture decisions
  2. Align your technical decisions with business requirements

Regularly review your architecture decisions: A case study with EKS

There are many reasons to review architecture decisions regularly. This story illustrates one of them: decisions made in the past because of cloud vendor constraints and limits that may no longer apply. For example, a service you use may have gained useful features that didn’t exist when you originally set things up.

To illustrate this, Ieva shared a real example of one of her AWS customers saving over $200,000 per month by enabling Topology Aware Hints on EKS — a feature released after they originally set up their architecture.

Initially, the company deployed their EKS cluster across five availability zones (AZs) within a region, both for high availability and to draw on as many spot instances as possible for cost savings. However, because Kubernetes by default spreads Service traffic across all endpoints regardless of zone, their inter-AZ data transfer charges surged unexpectedly.
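To get a feel for why spreading across five zones inflates inter-AZ charges, here is a rough back-of-the-envelope sketch. It assumes backends are spread evenly across zones and picked uniformly at random, which is a simplification. The function names, traffic volume, and per-GB price are hypothetical placeholders, not figures from the customer or from AWS pricing.

```python
def cross_zone_fraction(num_zones: int) -> float:
    """Share of requests that leave the caller's zone, assuming backends
    are spread evenly and chosen uniformly at random (a simplification)."""
    if num_zones < 1:
        raise ValueError("num_zones must be at least 1")
    return (num_zones - 1) / num_zones


def monthly_inter_az_cost(total_gb: float, num_zones: int, price_per_gb: float) -> float:
    """Estimated inter-AZ transfer cost for a month of internal traffic.

    price_per_gb is a placeholder; look up the current rate for your region.
    """
    return total_gb * cross_zone_fraction(num_zones) * price_per_gb


if __name__ == "__main__":
    # With five zones, roughly four out of five requests cross a zone boundary.
    print(cross_zone_fraction(5))  # 0.8
    # Hypothetical numbers: 1,000 GB of internal traffic at 0.02 per GB.
    print(monthly_inter_az_cost(1_000, 5, 0.02))
```

Under these assumptions, adding zones pushes the cross-zone share toward 100%, which is why the bill can grow even when total traffic stays flat.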

Activating Topology Aware Hints meant that endpoints were filtered based on AZ information, so most traffic stayed within the same zone. Keeping traffic zone-local drastically reduced their costs.
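As a minimal sketch of what enabling this looks like, the snippet below builds a Kubernetes Service manifest carrying the relevant annotation and prints it as JSON, which `kubectl apply -f` accepts. The service name, selector, and port are hypothetical. To my knowledge the annotation key is `service.kubernetes.io/topology-aware-hints` on older Kubernetes versions and `service.kubernetes.io/topology-mode` from 1.27 onward, so check which one your cluster version expects.

```python
import json


def service_manifest(name: str, port: int, use_topology_mode: bool = True) -> dict:
    """Build a Service manifest that opts in to topology-aware routing.

    use_topology_mode selects the newer annotation key (assumed to apply
    to Kubernetes 1.27+); set it to False for the older hints annotation.
    """
    if use_topology_mode:
        annotations = {"service.kubernetes.io/topology-mode": "Auto"}
    else:
        annotations = {"service.kubernetes.io/topology-aware-hints": "auto"}
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "annotations": annotations},
        "spec": {
            # Hypothetical selector: assumes pods are labeled app=<name>.
            "selector": {"app": name},
            "ports": [{"protocol": "TCP", "port": port, "targetPort": port}],
        },
    }


if __name__ == "__main__":
    # "image-api" and 8080 are placeholder values for illustration only.
    print(json.dumps(service_manifest("image-api", 8080), indent=2))
```

The annotation is only a hint: Kubernetes may fall back to cluster-wide routing when endpoints are unevenly distributed across zones, so it is worth confirming the effect in your own traffic metrics.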

Now, what to make of all this? Make it a part of your mindset to continuously reassess your architecture and keep up with new features and best practices for the services you pay the most for. Cloud services evolve rapidly, and what worked a year ago might need reevaluation today. Or there may be a new feature released that can help you do something you couldn’t when you originally set up your architecture.

Making sure technical decisions match business requirements

It can be easy to fall into the trap of over-engineering without considering whether your technical decisions align with actual business needs.

Take Eric’s story for example. 

His customer builds an imaging solution for optometrists and initially opted for Google Cloud’s Persistent Disk because of its speed, expandable capacity, and backup capabilities. The original goal was to let optometrists retrieve patient images quickly, ideally cutting retrieval time from three seconds to one.

But think for a moment about the average visit to an optometrist: 

Patients may get their eyes imaged, and these images are uploaded to cloud storage for future reference. Whether retrieval takes two seconds or ten typically makes no significant difference to the patient or to clinical outcomes. More often than not, these images are never accessed again once stored, except perhaps for compliance reasons.

Going with Persistent Disk led to the customer amassing petabytes of images on it, substantially increasing their storage costs without any additional benefit.

A closer examination revealed a fundamental misalignment between the technical choices made and the actual business requirements. The reality in this case was that the speed of image retrieval—whether it took ten seconds or two seconds—had little impact on the customer experience or the optometrist’s workflow.

Questioning the need for such speed and this storage type, by simply asking “Do you really need this *that* fast?”, made it clear that storing these rarely accessed images on Persistent Disk was not necessary. The customer subsequently moved to Google Cloud Storage with Autoclass, which automatically moves infrequently accessed objects to cheaper storage classes, ultimately leading to a 40% reduction in storage costs.
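To make the tiering idea concrete, here is a simplified sketch of how access-based tiering picks a storage class from the time since an object was last read. The idle-day thresholds are assumptions modeled on the 30, 90, and 365-day transitions I understand Autoclass to use. The real service has more rules (for example, a configurable terminal class, and accessed objects moving back to Standard), so treat this as an illustration rather than a description of Google’s implementation.

```python
# Assumed idle-day thresholds, checked from coldest to warmest.
# Verify against the current Autoclass documentation before relying on them.
TRANSITIONS = [
    (365, "ARCHIVE"),
    (90, "COLDLINE"),
    (30, "NEARLINE"),
]


def storage_class_for(days_since_last_access: int) -> str:
    """Return the storage class an object would sit in under the assumed rules."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    for min_idle_days, storage_class in TRANSITIONS:
        if days_since_last_access >= min_idle_days:
            return storage_class
    # Recently accessed objects stay in the hottest class.
    return "STANDARD"


if __name__ == "__main__":
    # Hypothetical images: one viewed yesterday, others untouched for longer.
    for idle_days in (1, 45, 120, 400):
        print(idle_days, storage_class_for(idle_days))
```

For a workload like the one in this story, where most images are never read again after upload, nearly everything ends up in the colder classes over time, which is where the savings come from.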

Closing thoughts

Cloud technology evolves rapidly. What was once the best solution may no longer be optimal.

Everything involved in running your cloud will need to evolve over time, whether you’re scaling, building new products with new services, or rearchitecting.

Often the required change is unanticipated: how many technical leaders were thinking about how to implement LLMs in their product one or two years ago?

The experiences shared by DoiT’s Technical Account Managers highlight two practices that help mitigate the risk unanticipated change brings: regularly reassessing your architecture and aligning technical decisions with business needs.
