Blot’s EC2 instance sometimes becomes unresponsive: it doesn’t respond to requests, SSH connections or even system reboots. I want to get the server responsive again when it’s down.
Since a system-level reboot doesn’t help, I manually stop and then manually start the instance.
CloudWatch offers an action to reboot an instance when an alarm goes off. However, it does not seem to offer a way to stop and then start an instance when an alarm goes off.
So I ended up publishing a message to an SNS topic when a CloudWatch alarm goes off. A Lambda function subscribed to that topic then stops the instance and starts it again.
CloudWatch (EC2 instance monitoring) -> SNS (Simple Notification Service) -> Lambda (serverless function invocation)
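
The Lambda step at the end of that chain could look roughly like the sketch below. It is not the actual function Blot runs, just a minimal illustration using boto3. The `INSTANCE_ID` environment variable, the placeholder instance ID, the forced stop, and the check on the alarm state are all assumptions made for the example.

```python
import json
import os

# Assumption: the instance ID is passed to the Lambda as an environment
# variable named INSTANCE_ID. The default value is only a placeholder.
INSTANCE_ID = os.environ.get("INSTANCE_ID", "i-0123456789abcdef0")


def is_alarm(event):
    """Return True if the SNS message says the CloudWatch alarm is in ALARM state."""
    # SNS delivers the CloudWatch alarm notification as a JSON string.
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    return message.get("NewStateValue") == "ALARM"


def restart_instance(ec2, instance_id):
    """Stop the instance, wait until it has stopped, then start it again."""
    # Assumption: Force=True, because the instance is unresponsive and may
    # not shut down cleanly on its own.
    ec2.stop_instances(InstanceIds=[instance_id], Force=True)
    # Block until EC2 reports the instance as stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.start_instances(InstanceIds=[instance_id])


def handler(event, context):
    """Lambda entry point, invoked by the SNS subscription."""
    if not is_alarm(event):
        # For example, the alarm returning to the OK state.
        return "ignored"
    import boto3  # imported here so the helpers above can be tested without it

    restart_instance(boto3.client("ec2"), INSTANCE_ID)
    return "restarted"
```

For something like this to work, the function’s timeout has to be raised well above the 3-second default, since waiting for an instance to stop can take minutes. Its execution role also needs permission for `ec2:StopInstances`, `ec2:StartInstances` and `ec2:DescribeInstances` (the waiter polls with the latter).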