EC2 - Connect to running instance by using the API - amazon-ec2

I created an EC2 instance via the provided web interface, and I am using the AWS API to connect to the existing running instance, but when I run the following code I get "You have 0 Amazon EC2 instance(s) running.":
DescribeAvailabilityZonesResult availabilityZonesResult = ec2.describeAvailabilityZones();
System.out.println("You have access to " + availabilityZonesResult.getAvailabilityZones().size()
        + " Availability Zones.");

DescribeInstancesResult describeInstancesResult = ec2.describeInstances();
List<Reservation> reservations = describeInstancesResult.getReservations();
Set<Instance> instances = new HashSet<Instance>();
for (Reservation reservation : reservations) {
    instances.addAll(reservation.getInstances());
}
System.out.println("You have " + instances.size() + " Amazon EC2 instance(s) running.");
Do you have any ideas about what might be the problem?

If you have double-checked that your instances are actually up and running, they are most likely not in the "us-east-1" region (the default region the AWS SDK assumes).
So set your AmazonEC2Client instance to point to the correct endpoint and everything should be fine, e.g. for Europe (Ireland):
ec2.setEndpoint("ec2.eu-west-1.amazonaws.com");
More details, as well as links to where you can find the endpoint strings, can be found in this SO answer.

Related

Executing Powershell script on remote Windows EC2 instance in Terraform

I am starting a Windows EC2 instance in AWS. After the server has been created, I want to install certain software like OpenSSH and perform some other tasks like creating a user. If I have a PowerShell script, how do I execute it on the remote instance?
I have a local PowerShell script, install_sft.ps1, and I want to execute it on the remote EC2 instance in AWS.
I know I need to use a "provisioner", but I am unable to get my head around how to use it for Windows.
resource "aws_instance" "win-master" {
provider = aws.lmedba-dc
ami = data.aws_ssm_parameter.WindowsAmi.value
instance_type = var.instance-type
key_name = "RPNVirginia"
associate_public_ip_address = true
vpc_security_group_ids = [aws_security_group.windows-sg.id]
subnet_id = aws_subnet.dc1.id
tags = {
Name = "Win server"
}
depends_on = [aws_main_route_table_association.set-master-default-rt-assoc]
}
You can do this by making use of the user_data (or user_data_base64) argument of the aws_instance resource:
resource "aws_instance" "win-master" {
...
user_data_base64 = "${base64encode(file(install_sft.ps1))}"
...
}
Just ensure that install_sft.ps1 is in the same directory as your Terraform code.
An EC2 instance's User Data script executes when it starts up for the first time. See the AWS documentation here for more details.

Unable to Launch EC2 Instances Asynchronously via Terraform

I want to launch two instances via Terraform. The first one will generate some certificate files and push them to an S3 bucket. The second instance will pull those certificates from that S3 bucket. Both operations are handled by user data. The problem is that the pull commands (AWS CLI) in the user data of the second instance are not working (they work when I run them from a shell). I think the issue is that Terraform launches both instances in parallel, so the second instance comes up before the first instance has pushed the certificates to S3.
I also tried to handle this by adding "depends_on" to my code, but it did not work. I am looking for a way to stagger the launches, e.g. launch the second instance 30 seconds after the first one. Here is the related part of the code.
data "template_file" "first_executor" {
template = file("some_path/first_executor.sh")
}
resource "aws_instance" "first_instance" {
ami = data.aws_ami.amazon-linux-2.id
instance_type = "t2.micro"
user_data = data.template_file.first_executor.rendered
network_interface {
device_index = 0
network_interface_id = aws_network_interface.first_instance-network-interface.id
}
}
###
data "template_file" "second_executor" {
template = file("some_path/second_executor.sh")
}
resource "aws_instance" "second_instance" {
depends_on = [aws_instance.first_instance]
ami = data.aws_ami.amazon-linux-2.id
instance_type = "t2.micro"
user_data = data.template_file.second_executor.rendered
network_interface {
device_index = 0
network_interface_id = aws_network_interface.second-network-interface.id
}
}
The answer is no. depends_on in Terraform only controls creation order: it waits for the dependent resource to be created, so your second EC2 instance will be created as soon as the first one has been created.
Terraform will not wait until your first EC2 instance is in the "running" state or until its user data has finished executing.
I would suggest going with depends_on and then, in your second instance's user data script, adding a loop that polls S3 and keeps retrying until the certificates are found, as sketched below.
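For example, here is a minimal sketch of such a wait loop in Python, assuming boto3 is available on the second instance (it can be installed from the same user data script); the bucket name, object key and download path are hypothetical, and the instance profile needs s3:GetObject permission on the bucket:
import time
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3", region_name="eu-central-1")  # assumed region

BUCKET = "my-cert-bucket"   # hypothetical bucket name
KEY = "certs/server.crt"    # hypothetical object key

def wait_for_object(bucket, key, delay=15, max_attempts=80):
    # Poll S3 until the object exists, or give up after max_attempts tries.
    for _ in range(max_attempts):
        try:
            s3.head_object(Bucket=bucket, Key=key)
            return True
        except ClientError:
            time.sleep(delay)
    return False

if wait_for_object(BUCKET, KEY):
    s3.download_file(BUCKET, KEY, "/etc/pki/tls/certs/server.crt")  # hypothetical target path
The same effect can be had with the AWS CLI you already call from the user data, e.g. aws s3api wait object-exists --bucket my-cert-bucket --key certs/server.crt, which polls for a limited time before giving up.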

Why is my CloudWatch alarm not being applied to the EC2 instances?

I have python code in a Lambda function to apply a CloudWatch alarm to EC2 instances.
The CloudWatch alarm is to reboot them if they are non-responsive for 10 minutes. This alarm is easy to create on a per-instance basis, but that is a lot of manual work since we have many servers.
I've set up a CloudWatch rule that triggers my Lambda function when an EC2 instance enters the "running" state after a reboot, or after a new EC2 instance is launched and gets to "running".
I have tried specifying a specific server in my code, and that works. However, what I want is a piece of code that applies the alarm to servers as they are rebooted, so as to cover them all as maintenance windows come around and they all get rebooted.
from collections import defaultdict
import boto3

ec2_sns = 'SNS-Topic:'
ec2_rec = "arn:aws:automate:eu-central-1:ec2:recover"

def lambda_handler(event, context):
    ec2 = boto3.resource('ec2')
    cw = boto3.client('cloudwatch')
    ec2info = defaultdict()
    running_instances = ec2.instances.filter(
        Filters=[{'Name': 'tag-key', 'Values': ['cloudwatch']}])
    for instance in running_instances:
        for tag in instance.tags:
            if 'Name' in tag['Key']:
                name = tag['Value']
        ec2info[instance.id] = {'Name': name, 'InstanceId': instance.instance_id}
    attributes = ['Name', 'InstanceId']
    for instance_id, instance in ec2info.items():
        instanceid = instance["InstanceId"]
        nameinsta = instance["Name"]
        print(instanceid, nameinsta)
        # Create StatusCheckFailed alarms
        cw.put_metric_alarm(
            AlarmName=('InstanceId') + "_System_Unresponsive_(Created by Lambda)",
            AlarmDescription='System_unresponsive for 10 minutes',
            ActionsEnabled=True,
            OKActions=[
                'No data',
            ],
            AlarmActions=[
                'arn:aws:lambda:eu-central-1:788677770941:function:System_unresponsive:reboot',
            ],
            InsufficientDataActions=[
                'Insuficient data',
            ],
            MetricName='StatusCheckFailed',
            Namespace='AWS/EC2',
            Statistic='Average',
            Dimensions=[{'Name': "InstanceId", 'Value': instanceid}],
            Period=300,
            Unit='Seconds',
            EvaluationPeriods=2,
            DatapointsToAlarm=2,
            Threshold=1,
            ComparisonOperator='LessThanOrEqualToThreshold')
I expect the code to apply the specified CloudWatch alarm to servers as they are rebooted, but it doesn't.
When I test it, all I get is "null" as a result.
You can use CloudTrail to get insight into the API calls that AWS makes to start the instances, and catch just those specific events with CloudWatch Events.
Once you catch the right events and send them to a Lambda function, the Lambda will receive the instance ID in the event information. You can use that information to create/update the alarms just for the instance contained in the event, as in the sketch below. You can use print(json.dumps(event)) inside the function to inspect the event contents in CloudWatch Logs.
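As a minimal sketch (not your exact setup): assuming the Lambda is triggered by the "EC2 Instance State-change Notification" event for the "running" state, the handler can read the instance ID straight from the event and create the alarm for just that instance. The alarm action ARN below is only an example; substitute your own reboot/recover action.
import json
import boto3

cw = boto3.client("cloudwatch")

def lambda_handler(event, context):
    print(json.dumps(event))                      # inspect the event in CloudWatch Logs
    instance_id = event["detail"]["instance-id"]  # the instance that just entered "running"

    # Create (or update) the StatusCheckFailed alarm for this one instance.
    cw.put_metric_alarm(
        AlarmName=instance_id + "_System_Unresponsive_(Created by Lambda)",
        AlarmDescription="System unresponsive for 10 minutes",
        ActionsEnabled=True,
        AlarmActions=["arn:aws:automate:eu-central-1:ec2:reboot"],  # example action ARN
        MetricName="StatusCheckFailed",
        Namespace="AWS/EC2",
        Statistic="Maximum",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        Period=300,
        EvaluationPeriods=2,
        DatapointsToAlarm=2,
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
    )
    return instance_id
This way each invocation configures exactly the instance named in the event, so there is no need to enumerate tagged instances on every run.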

Glue job times out when calling the AWS boto3 client API

I am using the Glue console, not a dev endpoint. The Glue job is able to access the Glue catalog and table using the code below:
datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "glue-db", table_name = "countries")
print "Table Schema:", datasource0.schema()
print "datasource0", datasource0.show()
Now I want to get the metadata for all tables from the Glue database glue-db.
I could not find a function for this in the awsglue.context API, therefore I am using boto3.
client = boto3.client('glue', 'eu-central-1')
responseGetDatabases = client.get_databases()
databaseList = responseGetDatabases['DatabaseList']
for databaseDict in databaseList:
    databaseName = databaseDict['Name']
    print("databaseName:{}".format(databaseName))
    responseGetTables = client.get_tables(DatabaseName=databaseName, MaxResults=123)
    print("responseGetDatabases{}".format(responseGetTables))
    tableList = responseGetTables['TableList']
    print("response Object{0}".format(responseGetTables))
    for tableDict in tableList:
        tableName = tableDict['Name']
        print("-- tableName:{}".format(tableName))
The code runs in a Lambda function, but fails within the Glue ETL job with the following error:
botocore.vendored.requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='glue.eu-central-1.amazonaws.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, 'Connection to glue.eu-central-1.amazonaws.com timed out. (connect timeout=60)'))
The problem seems to be in the environment configuration. The Glue VPC has two subnets:
private subnet: has an S3 endpoint for Glue and allows inbound traffic from the RDS security group.
public subnet: in the Glue VPC, with a NAT gateway. The private subnet is reachable through the NAT gateway.
I am not sure what I am missing here.
Try using a proxy while creating the boto3 client:
import boto3
from botocore.config import Config
from pyhocon import ConfigFactory

service_name = 'glue'
region = 'eu-central-1'  # the job's region

default = ConfigFactory.parse_file('glue-default.conf')
override = ConfigFactory.parse_file('glue-override.conf')
host = override.get('proxy.host', default.get('proxy.host'))
port = override.get('proxy.port', default.get('proxy.port'))

config = Config()
if host and port:
    config.proxies = {'https': '{}:{}'.format(host, port)}

client = boto3.Session(region_name=region).client(service_name=service_name, config=config)
glue-default.conf and glue-override.conf are deployed to the cluster by Glue during spark-submit, into the /tmp directory.
I had a similar issue, and I did the same by using the public library from Glue:
s3://aws-glue-assets-eu-central-1/scripts/lib/utils.py
Can you please try creating the boto3 client as below, specifying the region explicitly?
client = boto3.client('glue',region_name='eu-central-1')
I had a similar problem when I was running this command from a Glue Python Shell job.
So I created an endpoint (VPC -> Endpoints) for the Glue service (service name: "com.amazonaws.eu-west-1.glue") and assigned it to the same subnet and security group as the Glue connection used in the Glue Python Shell job.
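If you would rather script the endpoint creation than click through the console, here is a hedged boto3 sketch; the VPC, subnet and security group IDs are placeholders, and the service name follows the com.amazonaws.<region>.glue pattern:
import boto3

ec2 = boto3.client("ec2", region_name="eu-central-1")

# Create an interface VPC endpoint for Glue so that Glue API calls made from
# inside the VPC do not need to go out through a NAT gateway.
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",                  # placeholder
    ServiceName="com.amazonaws.eu-central-1.glue",  # Glue service name for your region
    SubnetIds=["subnet-0123456789abcdef0"],         # same subnet as the Glue connection
    SecurityGroupIds=["sg-0123456789abcdef0"],      # same security group as the Glue connection
    PrivateDnsEnabled=True,
)
print(response["VpcEndpoint"]["VpcEndpointId"])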

AWS - get autoscaling names for use with Capistrano

I am trying to setup a remote deployment with Capistrano on the Amazon Cloud.
The idea: I SSH into a random machine of the autoscaling group and I want to deploy to all the other machines from there. In order to do that, I need to get the names of the other instances so I can define the Capistrano servers I want to deploy to.
I have installed the Ruby SDK but I cannot figure out the best way to retrieve the instance names (taking advantage of the fact that I am on the VPN).
I actually have two possibilities: either find the instances by tags (I have tagged them with "production") or by the ID of the autoscaling group.
I don't want to use other "big guns" like Chef, etc.
After reading too much documentation, here are two strategies: retrieve the DNS names by autoscaling group OR by tags.
By Tags
ec2 = Aws::EC2::Client.new

instances_tagged = ec2.describe_instances(
  dry_run: false,
  filters: [
    {
      name: 'tag:environment',
      values: ['production'],
    },
    {
      name: 'tag:stack',
      values: ['rails'],
    },
  ],
)

# Collect instances across all reservations, not just the first one.
dns_tagged = instances_tagged.reservations.flat_map(&:instances).map(&:private_dns_name)
By Autoscaling group
as = Aws::AutoScaling::Client.new

instances_of_as = as.describe_auto_scaling_groups(
  auto_scaling_group_names: ['Autoscaling-Group-Name'],
  max_records: 1,
).auto_scaling_groups[0].instances

if instances_of_as.empty?
  autoscaling_dns = []
else
  instance_ids = instances_of_as.map(&:instance_id)
  # Look up each instance through the EC2 resource interface.
  ec2_resource = Aws::EC2::Resource.new(client: ec2)
  autoscaling_dns = instance_ids.map do |instance_id|
    ec2_resource.instance(instance_id).private_dns_name
  end
end
