This happens all the time and the answer is pretty simple: the engineering chops of the team may not be up to snuff. Too many old-school sysadmins were handed "Cloud/DevOps/SRE" titles and didn't update their skills, which is why a lot of top companies expect their SWEs to manage resources. Hate to say it, but you may need better engineers if your team can't use the additional options cloud providers (and open source tools) give you, many at no additional cost on top of compute/data transfer, to scale your operations and optimize for cost, or can't build things in a way that avoids vendor lock-in (IaC/Terraform/containerization, along with having someone who actually understands cloud architecture). 9/10 chance your team "migrated everything to cloud" as a 1:1 match with what you were running in a DC and then went shocked_pikachu when it cost more. Additionally, have y'all factored in all the time/money spent on maintaining the server hardware, power, DC cooling, etc. too? Cloud providers just plain have better engineers than any average company, especially ones doing the whole "this is more!" dance post cloud migration.
You can absolutely do the whole 1:1 migration to cloud, but expect things to balloon at least a bit post-migration, and then immediately start learning all the tools these providers give you to tighten down your cloud spend. How much are you spending on disk? Would bucket storage be cheaper? RE: Containers, even if you DO go that route, do you really need Kubernetes, which will come at an additional monetary and also maintenance cost? The likely answer at least initially is a big fat "no". Are you running every VM, even lower envs, 24/7? Is that required? If your services are not stateless, work to make them such so you can learn about scaling in the cloud, which can even be done w/ VM-based services.
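To put rough numbers on the "are you running every VM 24/7?" question, here's a back-of-the-envelope sketch. The hourly rate, VM count, and schedule are made-up assumptions, not real provider pricing, and it only counts compute: on the major clouds, disks attached to a stopped VM generally still get billed.

```python
# Rough estimate of what scheduling lower-env VMs saves versus leaving them on 24/7.
# All rates and counts are hypothetical placeholders, not real provider prices.

HOURS_PER_WEEK = 24 * 7
WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a month


def monthly_compute_cost(hourly_rate: float, vm_count: int, on_hours_per_week: float) -> float:
    """Compute-only monthly cost for a fleet that runs on_hours_per_week each week."""
    return hourly_rate * vm_count * on_hours_per_week * WEEKS_PER_MONTH


def schedule_savings(hourly_rate: float, vm_count: int, on_hours_per_week: float) -> tuple[float, float]:
    """Return (monthly savings, fraction saved) compared with running 24/7."""
    always_on = monthly_compute_cost(hourly_rate, vm_count, HOURS_PER_WEEK)
    scheduled = monthly_compute_cost(hourly_rate, vm_count, on_hours_per_week)
    return always_on - scheduled, 1 - scheduled / always_on


# Assumed example: 10 dev/test VMs at $0.10/hour, kept on 12 hours a day, 5 days a week.
saved, fraction = schedule_savings(hourly_rate=0.10, vm_count=10, on_hours_per_week=12 * 5)
print(f"saves about ${saved:.2f}/month ({fraction:.0%} of compute spend)")
```

Swap in your own rates and hours; the point is only that lower envs which sit idle nights and weekends are usually the cheapest place to start.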
I'm not even going to touch on how much more agility using cloud vs a DC gives you.
This all may sound a bit aggressive, but it's not meant to be. It's just, when you've seen this same complaint many times, ya know. This is a learning opportunity to figure out so much about how to build your environment using relevant cloud services.
I agree that good cloud engineers can save costs in the cloud. But I also think good non-cloud engineers can save much, much more.
When you are rewriting your entire stack to leverage cloud performance, you could probably spend a similar effort for a rewrite that increases regular performance by a similar factor.
RE: Containers, even if you DO go that route...
I was under the impression that stateless stuff without containers requires strong vendor lock-in (AWS Lambda, Google Cloud Functions, Azure Functions). Are you saying I could do stateless without vendor lock-in, without containers, and without Kubernetes? This is news to me. Please point me to some resources.
These days there are many solutions for deploying Kubernetes on a fleet of bare-metal servers, so if you use Kubernetes, the option to take everything in house again is available. Distributed storage is the toughest piece to set up in house, but there are many mature solutions that integrate well with Kubernetes these days.
Is stateless possible without Kubernetes (and without vendor lock-in)?
GP said:
RE: Containers, even if you DO go that route, do you really need Kubernetes, which will come at an additional monetary and also maintenance cost? The likely answer at least initially is a big fat “no”.
There are self-hosted runtimes such as workerd that allow you to run your own stateless lambda-like platform. That approach is kinda losing steam these days though, and everyone seems to be pushing self-hosted Kubernetes as the best way to get off the cloud.
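For what it's worth, "stateless" doesn't depend on any of those platforms. A minimal sketch, using only the Python standard library's WSGI conventions: the process keeps nothing between requests, and anything that must persist goes to an external store. `InMemoryStore` and `make_app` are hypothetical names for illustration; the in-memory store is just a stand-in for whatever database or cache you'd really use.

```python
# Sketch of a stateless service with no containers, Kubernetes, or vendor functions.
# Handlers hold no per-process state; persistence is delegated to a store object.
from typing import Callable, Protocol


class CounterStore(Protocol):
    def incr(self, key: str) -> int: ...


class InMemoryStore:
    """Stand-in for an external store (database, cache). Hypothetical, for this sketch only."""

    def __init__(self) -> None:
        self._data: dict[str, int] = {}

    def incr(self, key: str) -> int:
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


def make_app(store: CounterStore) -> Callable:
    """Build a WSGI app whose only state lives in the store passed in."""

    def app(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        count = store.incr(path)
        body = f"{path} seen {count} times\n".encode()
        headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
        start_response("200 OK", headers)
        return [body]

    return app


# To serve it on a plain VM with the stdlib reference server:
#   from wsgiref.simple_server import make_server
#   make_server("0.0.0.0", 8000, make_app(InMemoryStore())).serve_forever()
```

Because the app instances share nothing except the store, you can run several copies on ordinary VMs behind a load balancer and add or remove them freely, which is the scaling property the GP was describing.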
Skill is part of it, but mostly it's the company not investing the time to train people to do it right. The company just wants to say "flip the cloud switch" next week and immediately see its costs go down, without any outages and without putting in due diligence.
And sometimes the CEO/CIO/manager is too busy to coordinate training because they're off at their own "cloud provider training", for them only, at a super swanky spa and resort in the Swiss Alps.
If your services are not stateless, work to make them such so you can learn about scaling in the cloud, which can even be done w/ VM-based services.
how much more agility using cloud vs a DC gives you
This can't be overstated. Embracing an elastic ideology to remove single points of failure and decoupling the stateful aspects of applications has been my biggest takeaway from being part of several migrations of services to AWS. Implementing these into your practices as you grow is a huge benefit that may be worth the cost.
Over time, if the scale you're operating at grows, using experience/knowledge from AWS and applying it to running services in a datacenter could be beneficial. In my experience, if you have a large, consistent, asynchronous workload which you've maxed out on reserved instances or savings plans, it is likely cheaper to operate on your own hardware than in the cloud (or get credits from GCP or Azure to migrate services to reduce costs). This is where avoiding vendor lock-in is key.
have y’all factored in all the time/money spent on maintaining the server hardware, power, DC cooling, etc. too?
For sure, this isn't 2007 where you need to purchase servers and network equipment to start a website. For most startups and small businesses, operating in the cloud will be less expensive upfront and likely over the first 3 years. This isn't a one-size-fits-all approach though, and it'd be prudent to evaluate the cloud spend periodically and compare it with what it'd cost to manage everything yourself. Obviously you'd need a team competent enough to manage this without it going to shit.
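That periodic comparison can start as a simple break-even calculation. A rough sketch, with every dollar figure invented for illustration: the on-prem monthly number is meant to bundle staff time, power, cooling, and colo fees, and hardware refresh cycles are ignored.

```python
# Break-even sketch: cumulative cloud spend vs. upfront hardware plus monthly on-prem costs.
# All dollar figures are hypothetical; hardware refresh and growth are not modeled.
from typing import Optional


def break_even_month(
    cloud_monthly: int,
    onprem_upfront: int,
    onprem_monthly: int,
    horizon_months: int = 120,
) -> Optional[int]:
    """Return the first month where cumulative on-prem cost is at or below cloud, else None."""
    for month in range(1, horizon_months + 1):
        cloud_total = cloud_monthly * month
        onprem_total = onprem_upfront + onprem_monthly * month
        if onprem_total <= cloud_total:
            return month
    return None


# Assumed example: $20k/month in the cloud vs. $300k of hardware plus $12k/month to run it.
month = break_even_month(cloud_monthly=20_000, onprem_upfront=300_000, onprem_monthly=12_000)
print(f"on-prem breaks even in month {month}")  # month 38 with these inputs
```

With those made-up inputs the crossover lands a little past the three-year mark; if your cloud bill is smaller or your on-prem run cost is higher, it may never cross at all, which is the case for most small shops.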