How to restart an ec2 server when CloudWatch Synthetics Canary fails? - amazon-ec2

I have a site I'm working on, and one of the pages retrieves data from another server, let's call it server B.
Occasionally server B fails to return data, and the main site will give a 500 error.
I want to restart server B when that happens, and I was thinking I could use CW synthetics to do that. I've created a CW alarm to trigger, but I don't have a direct way to restart an ec2 server, since it's not associated directly with one.
I've thought of calling a lambda that will restart the server, but I'm wondering if there's a simpler configuration/solution I can use.
Thanks

You can create an EventBridge rule for a failed canary run. In Event pattern -> AWS service, select CloudWatch Synthetics, and in Event type select Synthetics Canary TestRun Failure. Then in Target, select AWS service -> EC2 RebootInstances API call and give the instance ID.
UPDATED:
You can use a custom pattern and pass JSON that matches the failure event.
In your case I would use something like:
{
  "source": ["aws.synthetics"],
  "detail-type": ["Synthetics Canary TestRun Failure"],
  "region": ["us-east-1"],
  "detail": {
    "account-id": ["123456789012"],
    "canary-id": ["EXAMPLE-dc5a-4f5f-96d1-989b75a94226"],
    "canary-name": ["events-bb-1"]
  }
}

Create an EventBridge rule for a failed canary run that triggers a Lambda function. Have the Lambda function restart the EC2 server via the AWS API/SDK.
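A minimal sketch of such a Lambda handler is below. The instance ID is a placeholder you would replace with server B's actual ID, and the `ec2` parameter is only there so the function can be exercised with a fake client; in Lambda it defaults to a real boto3 client.

```python
# Hypothetical instance ID -- replace with the ID of "server B".
INSTANCE_ID = "i-0123456789abcdef0"

def handler(event, context, ec2=None):
    """Reboot the backing instance when a canary-failure event arrives.

    The ec2 parameter allows injecting a client for testing; by default
    a real boto3 EC2 client is created.
    """
    if ec2 is None:
        import boto3  # imported lazily so tests need no AWS SDK
        ec2 = boto3.client("ec2")
    # RebootInstances is a soft reboot; if the box is wedged you may
    # prefer stop_instances followed by start_instances instead.
    ec2.reboot_instances(InstanceIds=[INSTANCE_ID])
    return {"rebooted": INSTANCE_ID}
```

Note that `reboot_instances` is best-effort and asynchronous; the call returning does not mean the instance is healthy again.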

Related

GCP - creating a VM instance and extracting logs

I have a Java application in which I am using GCP to create VM instances from images.
In this application, I would like to allow the user to view the VM creation logs in order to stay updated on the status of the creation, and to be able to see failure points in detail.
I am sure such logs exist in GCP, but I have been unable to find specific APIs which let me see a specific action, for example the creation of instance "X".
Thanks for the help
When you create a VM, the response you get is a JobID (the creation takes time, and the Compute Engine API answers immediately). To know the status of the VM creation and start, you have to poll this JobID regularly.
In the logs, you can also filter on this JobID to select and view only the logs you want on the Compute API side (create/start errors).
If you want to see the logs of the VM itself, filter the logs not by the JobID but by the name of the VM and its zone.
In Java, there are client libraries that help you achieve this.
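As a library-agnostic sketch of the polling step, the loop below takes any zero-argument callable that reports the operation's current status (the callable, the status strings, and the timing defaults are illustrative assumptions, not the actual GCP client API):

```python
import time

def wait_for_operation(fetch_status, poll_interval=5.0, timeout=300.0,
                       sleep=time.sleep):
    """Poll fetch_status() until it returns "DONE" or the timeout expires.

    fetch_status is any zero-argument callable returning the operation's
    current status string (e.g. "PENDING", "RUNNING", "DONE"). The sleep
    parameter is injectable so tests can skip real waiting.
    """
    waited = 0.0
    while waited < timeout:
        if fetch_status() == "DONE":
            return "DONE"
        sleep(poll_interval)
        waited += poll_interval
    raise TimeoutError("operation did not finish in %.0f seconds" % timeout)
```

With a real client library, `fetch_status` would wrap whatever call retrieves the operation by its JobID.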

AWS SSM describe_instance_information using old data?

I have a lambda function that is running:
ssm = boto3.client('ssm')
print(ssm.describe_instance_information())
It returns 6 instances. 5 are old instances that have been terminated and no longer show up in my console. One instance is correct. I created an AMI image of that instance and tried launching several instances under the same security group and subnet. None of those instances are returned from describe_instance_information. Is it reporting old data?
My end goal is to have the lambda function launch an instance using the AMI and send a command to it. Everything works if I use the existing instance. I am trying to get it to work with one created from the AMI.
EDIT:
After a while, the instances did show up; I guess it takes a while. I don't understand why terminated instances still show up. I can poll describe_instance_information until the instance ID I want shows up, but is there a cleaner built-in function, like wait_for_xxxxx()?
You can use the Filters parameter with PingStatus, which reflects the connection status of the SSM Agent:
response = client.describe_instance_information(
    Filters=[
        {
            'Key': 'PingStatus',
            'Values': ['Online']
        },
    ]
)
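There is no built-in SSM waiter for this, but you can roll your own along the lines below (mirroring the EC2 waiter defaults of 40 attempts, 15 seconds apart; the helper name and the injectable `ssm`/`sleep` parameters are my own additions for testability):

```python
import time

def wait_for_ssm_instance(instance_id, ssm=None, attempts=40, delay=15,
                          sleep=time.sleep):
    """Poll describe_instance_information until instance_id is Online.

    Raises TimeoutError if the SSM agent never registers as Online.
    """
    if ssm is None:
        import boto3  # imported lazily so tests need no AWS SDK
        ssm = boto3.client("ssm")
    for _ in range(attempts):
        resp = ssm.describe_instance_information(
            Filters=[
                {"Key": "InstanceIds", "Values": [instance_id]},
                {"Key": "PingStatus", "Values": ["Online"]},
            ]
        )
        if resp.get("InstanceInformationList"):
            return resp["InstanceInformationList"][0]
        sleep(delay)
    raise TimeoutError("SSM agent on %s never came online" % instance_id)
```

Filtering on both InstanceIds and PingStatus also sidesteps the stale terminated registrations, since they will not match the new instance's ID.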

How to verify that AWS lambda function is running on raspberry pi 3 for Greengrass?

I am following the official AWS doc for the AWS Greengrass setup on a Raspberry Pi 3. I have already completed
Module 1: Environment Setup for Greengrass
Module 2: Installing the AWS IoT Greengrass Core Software
When it comes to
Module 3 (Part 1): Lambda Functions on AWS IoT Greengrass
, I got stuck at "Verify the Lambda Function Is Running on the Core Device",
because I can't see "hello world! Sent from greengrass core running on platform: Linux - 4.19.86-v7+-armv7l-with-debian9.0" in the MQTT client dashboard when subscribing to the topic "hello/world".
I have already deployed successfully to my Greengrass group and set up the subscriptions and Lambda functions as explained in the AWS docs. I have also started the daemon on the Raspberry Pi 3 with the command
sudo ./greengrassd start
at path location
/greengrass/ggc/core
I have also checked the GGConnManager.log file at
/greengrass/ggc/var/log/system
and its last log line is:
[INFO]-MQTT server started.
But still didn't get any expected result at MQTT client dashboard.
Am I missing something? How should I publish or subscribe to this topic for this task?
Or should I try some other method to verify this AWS Lambda function? Please help.
If you don't have a user directory under the log directory, that means your user Lambda function never executed. You probably need to set the function to be a pinned (long-lived) lambda; see section 7 of https://docs.aws.amazon.com/greengrass/latest/developerguide/config-lambda.html for how to set that.
Here are a few things to try out.
Go to AWS Console -> Greengrass Group -> (your group) -> Settings -> Logs (make sure you select Local Logs for User Lambdas).
If you have done the rest correctly, you should see lambda logs under /greengrass/ggc/var/log/user/<region>/<account-id>/<function-name>.log
For the sake of testing, you may want to add some console logs to your Lambdas (on module load, not on handler invocation).
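For instance, a minimal sketch of a handler with a module-load log (the function body is illustrative, not the tutorial's actual hello-world code): the print at import time fires once when the Greengrass core loads the function, which is exactly what you want to confirm for a pinned lambda.

```python
import json

# This line runs once, when the module is loaded (i.e. when the
# Greengrass core starts the function) -- not on each invocation.
print("greengrass function module loaded")

def function_handler(event, context=None):
    # Per-invocation logging happens here instead.
    print("handler invoked with: %s" % json.dumps(event))
    return {"ok": True}
```

If the module-load line never appears in the user log directory, the function is not being started at all, which points at the pinned-lambda setting above.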
cheers,
ram

Terraform - having timing issues launching EC2 instance with instance profile

I'm using Terraform to create my AWS infrastructure.
I've a module that creates an "aws_iam_role", an "aws_iam_role_policy", and an "aws_iam_instance_profile" and then launches an EC2 Instance with that aws_iam_instance_profile.
"terraform plan" works as expected, but with "terraform apply" I consistently get this error:
* aws_instance.this: Error launching source instance: InvalidParameterValue: IAM Instance Profile "arn:aws:iam::<deleted>:instance-profile/<deleted>" has no associated IAM Roles
If I immediately rerun "terraform apply", it launches the EC2 instance with no problem. If I run a "terraform graph", it does show that the instance is dependent on the profile.
Since the second "apply" is successful, that implies that the instance_policy and all that it entails is getting created correctly, doesn't it?
I've tried adding a "depends_on" and it doesn't help, but since the graph already shows the dependency, I'm not sure that is the way to go anyway.
Anyone have this issue?
Race conditions are quite common between services, where state is only eventually consistent due to scale. This is particularly true with IAM: you will often create a role and give a service such as EC2 a trust relationship to use the role for an EC2 instance, but because of how IAM is propagated throughout AWS, the role will not be available to EC2 for a few seconds after creation.
The solution I have used, which is not a great one but gets the job done, is to put the following provisioner on every single IAM role or policy attachment to give the change time to propagate:
resource "aws_iam_role" "some_role" {
  # ...

  provisioner "local-exec" {
    command = "sleep 10"
  }
}
In this case you may use operation timeouts. Timeouts are handled entirely by the resource type implementation in the provider, but resource types offering these features follow the convention of defining a child block called timeouts that has a nested argument named after each operation that has a configurable timeout value. Each of these arguments takes a string representation of duration, such as "60m" for 60 minutes, "10s" for ten seconds, or "2h" for two hours.
resource "aws_db_instance" "example" {
  # ...

  timeouts {
    create = "60m"
    delete = "2h"
  }
}
Ref: https://www.terraform.io/docs/configuration/resources.html

EC2: Waiting until a new instance is in running state

I would like to create a new instance based on my stored AMI.
I achieve this by the following code:
RunInstancesRequest rir = new RunInstancesRequest(imageId,1, 1);
// Code for configuring the settings of the new instance
...
RunInstancesResult runResult = ec2.runInstances(rir);
However, I cannot find a way to block/wait until the instance is up and running, apart from a Thread.currentThread().sleep(xxxx) call.
On the other hand, StartInstancesResult and TerminateInstancesResult gives you a way to have access on the state of the instances and be able to monitor any changes. But, what about the state of a completely new instance?
boto3 has:
instance.wait_until_running()
From the boto3 docs:
Waits until this Instance is running. This method calls EC2.Waiter.instance_running.wait() which polls EC2.Client.describe_instances() every 15 seconds until a successful state is reached. An error is returned after 40 failed checks.
From the AWS CLI changelog for v1.6.0:
Add a wait subcommand that allows for a command to block until an AWS
resource reaches a given state (issue 992, issue 985)
I don't see this mentioned in the documentation, but the following worked for me:
aws ec2 start-instances --instance-ids "i-XXXXXXXX"
aws ec2 wait instance-running --instance-ids "i-XXXXXXXX"
The wait instance-running line did not finish until the EC2 instance was running.
I don't use Python/boto/botocore but assume it has something similar. Check out waiter.py on Github.
Waiting for the EC2 instance to get ready is a common pattern. In the Python library boto you can also solve this with polling and sleep calls:
reservation = conn.run_instances([Instance configuration here])
instance = reservation.instances[0]
while instance.state != 'running':
    print('...instance is %s' % instance.state)
    time.sleep(10)
    instance.update()
With this mechanism you will be able to poll when your new instance will come up.
Depending on what you are trying to do (and how many servers you plan on starting), instead of polling for the instance start events, you could install on the AMI a simple program/script that runs once when the instance starts and sends out a notification to that effect, i.e. to an AWS SNS Topic.
The process that needs to know about new servers starting could then subscribe to this SNS topic, and would receive a push notifications each time a server starts.
Solves the same problem from a different angle; your mileage may vary.
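A minimal sketch of the notification side of this approach, meant to run once at boot from a script baked into the AMI (the topic ARN is a placeholder, and the `sns` parameter is injectable so the flow can be tested without AWS credentials):

```python
def notify_instance_started(topic_arn, instance_id, sns=None):
    """Publish an "instance started" message to an SNS topic.

    Intended to run once at boot, e.g. from a systemd unit or user-data
    script included in the AMI.
    """
    if sns is None:
        import boto3  # imported lazily so tests need no AWS SDK
        sns = boto3.client("sns")
    return sns.publish(
        TopicArn=topic_arn,
        Subject="instance-started",
        Message="instance %s is up" % instance_id,
    )
```

Any process subscribed to the topic then gets a push notification instead of having to poll.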
Go use Boto3's wait_until_running method:
http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.Instance.wait_until_running
You can use boto3 waiters:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#waiters
For this case: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Waiter.InstanceRunning
Or in Java: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/
I am sure waiters are implemented in all the AWS SDKs.
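Putting the waiter answers together, a short boto3 sketch (the instance ID is a placeholder, and the `ec2` parameter is injectable so the flow can be exercised without AWS credentials):

```python
def start_and_wait(instance_id, ec2=None):
    """Start an instance and block until it reaches the running state."""
    if ec2 is None:
        import boto3  # imported lazily so tests need no AWS SDK
        ec2 = boto3.client("ec2")
    ec2.start_instances(InstanceIds=[instance_id])
    # The instance_running waiter polls DescribeInstances every 15 s,
    # up to 40 times, before raising a WaiterError.
    waiter = ec2.get_waiter("instance_running")
    waiter.wait(InstanceIds=[instance_id])
    return instance_id
```

Note that "running" only means the hypervisor reports the instance is up; if you need the OS and its services ready, the SNS-from-boot approach above (or the instance_status_ok waiter) is a better fit.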