Amazon Elastic Compute Cloud (EC2) is an AWS service that provides resizable computing capacity in the cloud. With it, we can launch virtual machines, called instances, and run our applications on them.
EC2 continuously monitors all running instances to identify hardware- and software-related issues. These status checks run every minute, and because they’re built into EC2, they can’t be deleted or disabled. EC2 performs three types of status checks:
System status check
Instance status check
Attached EBS status check
The system status check monitors the AWS systems on which an instance runs, that is, the underlying infrastructure of our EC2 instances. Problems detected by this check are usually resolved by AWS, although we can sometimes work around them ourselves by moving the instance to a different host. Here are some of the reasons that might cause the system status check to fail:
Loss of network connectivity: The physical host loses network connectivity because of a problem in the AWS infrastructure. Misconfigured security groups or route tables in our own account don’t cause this check to fail.
Loss of system power: The system status check can fail if the physical host where our EC2 instance is located loses power. Stopping and starting an EBS-backed instance usually moves it to another host, which resolves this.
Underlying software and hardware issues: Software or hardware faults on the physical host can cause this check to fail. When this happens, the instance typically has to move to another host, either through automatic recovery (where it is supported) or by stopping and starting the instance ourselves.
The following command returns the current result of the system status check, for example, ok or impaired:
aws ec2 describe-instance-status --instance-ids <EC2 Instance ID> --query 'InstanceStatuses[0].SystemStatus.Status'
The instance status check is performed on the software and network configurations of our instances. We need to resolve the problems it detects ourselves, usually by rebooting the instance or changing its configuration. Here are some of the reasons that might cause the instance status check to fail:
Failed system status check: The instance status check will automatically fail if our system status check fails.
Faulty configurations: Misconfigured network settings, such as faulty IP configurations or incorrect routing, can cause the instance status check to fail.
Exhausted memory: The instance status check can fail if our instance runs out of its allocated memory.
Corrupted file system: If our operating system files are corrupted or missing, it can lead to a failed instance status check.
Incompatible kernel: The instance status check can fail if our kernel and AMI are not compatible. If this error occurs, we’ll have to update our kernel or modify our instance’s configuration.
The following command returns the current result of the instance status check:
aws ec2 describe-instance-status --instance-ids <EC2 Instance ID> --query 'InstanceStatuses[0].InstanceStatus.Status'
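The split of responsibility between the two checks can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names `suggest_action` and `apply_action` are hypothetical, and the mapping (stop and start for a failed system check, reboot for a failed instance check) is a simplification of the remedies mentioned above, not an official AWS procedure. The `aws ec2 stop-instances`, `wait instance-stopped`, `start-instances`, and `reboot-instances` commands themselves are standard AWS CLI commands.

```shell
# Hypothetical helper: map the two status values to a suggested next step.
# $1 = system status, $2 = instance status (e.g. "ok" or "impaired").
suggest_action() {
  local system_status="$1" instance_status="$2"
  if [ "$system_status" = "impaired" ]; then
    # Host-level problem: a stop/start usually lands the instance on a new host.
    echo "stop-start"
  elif [ "$instance_status" = "impaired" ]; then
    # Problem inside the instance: a reboot is the usual first attempt.
    echo "reboot"
  else
    # "ok", "initializing", "insufficient-data", etc.: nothing to do here.
    echo "none"
  fi
}

# Hypothetical helper: carry out the suggested step for one instance.
# $1 = instance ID, $2 = action printed by suggest_action.
apply_action() {
  local instance_id="$1" action="$2"
  case "$action" in
    stop-start)
      # Assumes an EBS-backed instance; data on instance store volumes is lost
      # and a non-Elastic public IPv4 address changes after a stop/start.
      aws ec2 stop-instances --instance-ids "$instance_id"
      aws ec2 wait instance-stopped --instance-ids "$instance_id"
      aws ec2 start-instances --instance-ids "$instance_id"
      ;;
    reboot)
      aws ec2 reboot-instances --instance-ids "$instance_id"
      ;;
    none)
      ;;
  esac
}
```

For example, `apply_action "<EC2 Instance ID>" "$(suggest_action impaired ok)"` would stop and then start the instance. A reboot won't fix every instance-level failure (a corrupted file system, for instance, needs manual repair), so the sketch covers only the first step.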
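The attached EBS status check monitors the EBS volumes attached to our instances. It usually fails because of hardware or software issues affecting the EBS volumes, and it can also fail if there are connectivity issues between the instance and its volumes. The result of this check is reported in the AttachedEbsStatus field of the describe-instance-status output. The following command lists the EBS volumes attached to an instance, which helps identify the volumes covered by the check: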
aws ec2 describe-instance-attribute --instance-id <EC2 Instance ID> --attribute blockDeviceMapping
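Instead of running one command per check, all three results can be requested in a single `describe-instance-status` call. The sketch below assumes a reasonably recent AWS CLI in which the response includes the `AttachedEbsStatus` field (older versions print `None` for it); the function names `fetch_status_checks` and `label_status_checks` are hypothetical. The `--include-all-instances` flag makes the call return an entry even when the instance isn't running.

```shell
# Hypothetical helper: print the three status-check results for one instance
# as a single tab-separated line (system, instance, attached EBS).
fetch_status_checks() {
  local instance_id="$1"
  aws ec2 describe-instance-status \
    --instance-ids "$instance_id" \
    --include-all-instances \
    --query 'InstanceStatuses[0].[SystemStatus.Status, InstanceStatus.Status, AttachedEbsStatus.Status]' \
    --output text
}

# Hypothetical helper: read such a line from stdin and label its three fields.
label_status_checks() {
  local system instance ebs
  read -r system instance ebs
  printf 'system=%s instance=%s ebs=%s\n' "$system" "$instance" "$ebs"
}
```

The two helpers are meant to be piped together, for example `fetch_status_checks "<EC2 Instance ID>" | label_status_checks`.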
Note: We can also view the status of each check in the AWS Management Console.
What’s the primary function of Amazon EC2?
To provide a managed database service
To offer massive computing power through virtual machines
To store data on virtual disks in the cloud
To distribute content to end users with lower latency
If all of the status checks described above pass for an instance, its overall health is set to "OK". If one or more checks fail, the overall health of the instance changes to "Impaired". Each status check also has a corresponding CloudWatch metric (for example, StatusCheckFailed, StatusCheckFailed_System, and StatusCheckFailed_Instance) that reports 1 when the check fails and 0 when it passes, and we can use these metrics to create CloudWatch alarms.
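The overall-health rule and a matching CloudWatch alarm can be sketched in shell as follows. The function names `overall_health` and `create_status_alarm`, the alarm name, and the choice of two one-minute evaluation periods are assumptions made for illustration. The `aws cloudwatch put-metric-alarm` command and the `StatusCheckFailed` metric in the `AWS/EC2` namespace are standard. The SNS topic ARN passed as the alarm action is a placeholder we would have to supply.

```shell
# Hypothetical helper: derive the overall health from individual check results.
# Prints "impaired" if any argument is "impaired", otherwise "ok".
# Other values (such as "initializing") are ignored in this sketch.
overall_health() {
  local status
  for status in "$@"; do
    if [ "$status" = "impaired" ]; then
      echo "impaired"
      return 0
    fi
  done
  echo "ok"
}

# Hypothetical helper: create an alarm that fires when any status check has
# failed for two consecutive one-minute periods.
# $1 = instance ID, $2 = ARN of an existing SNS topic to notify (placeholder).
create_status_alarm() {
  local instance_id="$1" sns_topic_arn="$2"
  aws cloudwatch put-metric-alarm \
    --alarm-name "status-check-failed-$instance_id" \
    --namespace AWS/EC2 \
    --metric-name StatusCheckFailed \
    --dimensions "Name=InstanceId,Value=$instance_id" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 2 \
    --threshold 1 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "$sns_topic_arn"
}
```

Requiring two consecutive failing periods is a design choice that avoids alarming on a single transient failure; a stricter setup could use one period instead.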