It happened again.
Amazon Web Services (AWS) went down yesterday for hours, bringing down a huge chunk of the internet with it. I didn’t realise at the time that AWS was the reason I couldn’t play the latest episode of Supergirl on my Apple TV.
And it was not just iCloud that was affected. It was not just the small sites. Big players were hit big time, including Apple, Adobe, Docker's Registry Hub, GitHub, GitLab, Quora, Medium, Signal, Slack, Imgur, Twitch.tv and many more.
This is not the first time AWS has gone down for hours, bringing everyone down with it. And it won’t be the last time.
This time the failure was caused by problems with Amazon’s storage service, S3, in its US-EAST-1 region. Some experts said that it could have been avoided if customers had distributed their services across multiple regions.
“While this does impact an estimated 20 percent of the internet, there are many businesses hosted on Amazon that are not having these issues,” Shawn Moore, CTO at Solodev, said in a statement.
Moore said those customers who fully embraced Amazon’s design philosophy to have their website data distributed across multiple regions were prepared.
“This is a wake-up call for those hosted on AWS and other providers to take a deeper look at how their infrastructure is set up and emphasizes the need for redundancy – a capability that AWS offers, but it’s now being revealed how few were actually using.”
What worries me is not so much that AWS went down and took an estimated 20 percent of the internet with it. What really bothers me is how many huge companies rely solely on AWS for their services.
What makes the internet so resilient to any failure is that there is no single vendor, authority or infrastructure controlling it. What makes the internet nuke-proof is that it's fully distributed.
But relying solely on AWS runs counter to the idea of the ‘distributed’ internet, because AWS effectively becomes the internet. That creates a single point of total failure, a single point of control. It tears the very fabric of our 'distributed' internet as we are putting all of our eggs in one single basket.
AWS poses a very serious threat to that resilience. It poses an existential threat to the internet, and it's no fault of Amazon's.
Goodbye internet, hello Amazon’s intranet.
Multi-cloud, not multi-region, is the answer
While I fully agree with Moore that AWS customers should distribute their services across regions, I think it’s about time companies started looking at a multi-cloud strategy. It’s about time we started taking some eggs out of the AWS basket and putting them in different baskets.
“Today’s S3 crash will inevitably cost businesses millions of dollars,” Chip Childers, CTO of the Cloud Foundry Foundation, said in a statement.
“This is why all businesses need a multi-cloud strategy so they can adapt immediately when, inevitably, one of their cloud vendors experiences a failure. It’s not Amazon’s fault, it’s inevitable. #cloudfoundry keeps your cloud options open.”
But what’s this multi-cloud strategy? In layman’s terms, a multi-cloud strategy means using more than one cloud provider or infrastructure to run your services. It increases redundancy, reduces vendor lock-in and allows you to use best-of-breed solutions instead of taking everything from the same vendor.
You can use IaaS from one vendor and PaaS from another, or mix and match. There are already a lot of open source solutions out there that enable customers to adopt a multi-cloud strategy.
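To make the redundancy argument concrete, here is a minimal sketch of what failing over between clouds could look like in application code. It is an illustration under stated assumptions, not any vendor's API: the provider names `cloud_a` and `cloud_b`, the `make_backend` helper and the `fetch_with_failover` function are all hypothetical, and the "providers" are in-memory stand-ins rather than real storage services.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class BackendUnavailable(Exception):
    """Raised by a (hypothetical) storage backend that cannot serve a request."""


# A fetcher takes an object key and returns the object's bytes.
Fetcher = Callable[[str], bytes]


def make_backend(objects: Dict[str, bytes], healthy: bool = True) -> Fetcher:
    """Build an in-memory stand-in for one provider's object store."""
    def fetch(key: str) -> bytes:
        if not healthy:
            raise BackendUnavailable("provider is down")
        if key not in objects:
            raise BackendUnavailable(f"no copy of {key!r}")
        return objects[key]
    return fetch


def fetch_with_failover(
    key: str, backends: Sequence[Tuple[str, Fetcher]]
) -> Tuple[str, bytes]:
    """Try each provider in order; return (name, data) from the first that answers."""
    failures: List[str] = []
    for name, fetch in backends:
        try:
            return name, fetch(key)
        except BackendUnavailable as exc:
            failures.append(f"{name}: {exc}")
    raise BackendUnavailable("all providers failed: " + "; ".join(failures))


# Assumption: the same object has been replicated to two independent providers.
replicas = {"logo.png": b"\x89PNG..."}
backends = [
    ("cloud_a", make_backend(replicas, healthy=False)),  # simulated outage
    ("cloud_b", make_backend(replicas)),
]

provider, data = fetch_with_failover("logo.png", backends)
print(provider)  # prints "cloud_b"
```

The point of the sketch is only that the second basket has to exist before the first one breaks: the data must already be replicated to the other provider, and the application must know how to ask for it there.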
The bottom line: stop putting all your eggs in the AWS basket. Go multi-cloud.
This article originally appeared on CIO.com