Amazon S3 Service Disruption
Learn about the Amazon S3 service disruption and the possible failure mitigation techniques.
Introduction
Amazon Simple Storage Service (S3) is an object storage service offered by Amazon Web Services (AWS). It is designed to be highly secure, scalable, and durable, and it lets clients store and retrieve data from anywhere.
On February 28, 2017, S3 started to fail in the Northern Virginia (US-EAST-1) region due to a human error. This service disruption lasted
In this lesson, we discuss the root cause of the S3 failure and how to mitigate such failures.
How did it happen?
The root cause of the S3 outage was a human error made during a routine debugging process. Let's look at how this happened:
An Amazon S3 team member was troubleshooting an issue with the billing system. The intention was to remove a small number of servers from one of the S3 subsystems used by the billing system.
One of the inputs to the command was entered incorrectly, which removed a much larger set of S3 servers than intended. ...
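One way to limit the damage from a mistyped removal command is to have the tooling check each request against capacity limits before acting on it. The following is a minimal sketch under stated assumptions, not AWS's actual tooling: `RemovalPolicy`, `validate_removal`, `UnsafeRemovalError`, and the threshold values are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


class UnsafeRemovalError(Exception):
    """Raised when a removal request would take out too much capacity."""


@dataclass(frozen=True)
class RemovalPolicy:
    # Hypothetical limits; real values would depend on the subsystem.
    min_servers: int      # fewest servers the subsystem needs to stay healthy
    max_fraction: float   # largest share of the fleet one command may remove


def validate_removal(total: int, requested: int, policy: RemovalPolicy) -> int:
    """Return the number of servers to remove, or raise if the request is unsafe."""
    if requested <= 0 or requested > total:
        raise ValueError(f"requested must be between 1 and {total}, got {requested}")

    # Reject requests that remove too large a share of the fleet at once.
    if requested > total * policy.max_fraction:
        raise UnsafeRemovalError(
            f"removing {requested} of {total} servers exceeds the "
            f"{policy.max_fraction:.0%} per-command limit"
        )

    # Reject requests that would drop the fleet below its minimum size.
    if total - requested < policy.min_servers:
        raise UnsafeRemovalError(
            f"removal would leave {total - requested} servers, "
            f"below the minimum of {policy.min_servers}"
        )

    return requested


if __name__ == "__main__":
    policy = RemovalPolicy(min_servers=80, max_fraction=0.05)
    print(validate_removal(total=100, requested=3, policy=policy))
    try:
        # A mistyped input (30 instead of 3) is rejected before anything is removed.
        validate_removal(total=100, requested=30, policy=policy)
    except UnsafeRemovalError as err:
        print(f"blocked: {err}")
```

With a check like this in place, a typo in the requested count fails fast instead of silently removing a large part of the fleet. The two limits are deliberately independent: the fraction cap bounds the impact of a single command, while the floor protects against several small removals adding up.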