Ansible and AWS EC2 inventory - ansible

I'd like to get EC2 instances metadata with Ansible and do something with those instances based on the metadata. However, ec2_facts wants to SSH into instances in order to get the metadata.
I believe it should be possible to obtain the instances metadata without SSH connections.
Could you help me with that please?
Thank you.

There is information you can retrieve about instances using the aws API but ec_facts does not use it. What that Ansible module does specifically is fetch metadata via http://169.254.169.254/latest/meta-data/ which can only be done from the instance itself.
Some more information about what instance data you wish to fetch would be helpful to know. At this time there is no aws cloud module in core that will retrieve general information about an instance but Ansible makes it easy to write one.
Here is an example of a module that returns information about instances that match a set of tags - https://github.com/edx/configuration/blob/master/playbooks/library/ec2_lookup

Related

Terraform : How to fetch or destroy resources created by other means?

Sometimes I end up creating resources using AWS console due to some errors in Terraform or for lack of time. Can I list all my resources and destroy them? Basically a discovery of existing cloud resources and management of such ?
Ex: list my EC2 instances using Terraform and destroy when needed . How to achieve this?
Terraform is designed to ignore any existing objects that it didn't create because otherwise it would be risky to adopt Terraform an existing system with many existing objects and it would be impossible to decompose the infrastructure into different configurations for each subsystem without each one trying to destroy the objects being managed by the others.
Terraform doesn't have any facility for automatically detecting objects created outside of Terraform, but you can explicitly bind specific objects from your remote system to resource instances in your Terraform configuration using the terraform import command.
That command has some safeguards to try to prevent accidentally immediately deleting an object you've just imported if e.g. you make a typo of the resource instance address, and so unfortunately the design of this command is contrary to your goal: it won't let you just import something and run terraform apply to destroy it.
Instead, you'd need to:
Write a stub empty resource block for a resource of the appropriate type in your configuration.
Run terraform import to bind your existing real object to that empty resource block.
After the import succeeds, immediately remove the resource block to tell Terraform that you intend to delete the object.
Run terraform apply, and then Terraform should notice that it's tracking an object that is no longer mentioned in the configuration and propose to delete it.
Terraform is not the best tool for this job because it has essentially been designed to do the exact opposite of what you want to do, because typically users want to avoid destroying untracked objects to avoid disrupting neighboring systems.
However, you may be able to get the effect you want with some custom programming on your part, by writing a program that does something like the following:
Run terraform show -json in all of your configuration working directories to obtain a machine-readable description of the Terraform state in each one.
Decode the JSON state descriptions to find all of the resource instances of type aws_instance and collect a set of all of their id attribute values. This is the set of instances to keep.
Call the EC2 API DescribeInstances action to retrieve a list of all of the instances that actually exist. Collect a set of all of their IDs. This is the set of instances that exist.
Set-subtract the set of instances to keep from the set of instances that exist. The result is the set of instances to destroy.
If the set of instances to destroy isn't empty, call the EC2 API's TerminateInstances action to terminate every instance ID in that set.
This description is specific to Amazon EC2 instances. The same pattern could apply to objects of any other type, but there is no general solution that will work across all object types at once because the AWS API design doesn't work that way: each object type has its own separate operations for querying which objects exist and for destroying a particular object or set of objects.

TerraForm: deploying EC2 instances without starting them

I want to deploy my infrastructure in different AWS environments (dev, prod, qa).
That deployment creates a few EC2 instances from a custom AMI. When deployed, instances are in the "running" state. I understand this seems to be related to some constraint in the EC2 API. However, I don't necessarily want my instances started, depending on context. Sometimes, I just want the instances to be created, and they will be started later on. I guess this is a quite common scenario.
Reading the few related issues/requests on Hashicorp's github, makes me think so:
Terraform aws instance state change
Stop instances
aws_instance should allow to specify the instance state
There must be some TerraForm based solution which doesn't require to rely on AWS CLI / CDK or lambda, right? Something in the TerraForm script that, for example, would stop the instance right after its creation.
My google foo didn't help me much here. Any help / suggestion for dealing with that scenario is welcome.
Provisioning a new instance automatically puts it in a 'started' state.
As Marcin suggested, you can use user data scripts, here's some psuedo user data script. For you to figure out the actual implementation ;)
#!/bin/bash
get instance id, pass it to the subsequent line
aws ec2 stop-instances --instance-ids i-1234567890abcdef0
You can read about running scripts as part of the bootstrapping here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
Basically its all up to your use case. We don't do this generally. Still, if you want to provision your EC2 instances and need to put them in stopped state, as bschaatsbergen suggested, you can use the user_data in Terraform. Make sure to attach the role with relevant permission.
#!/bin/bash
INSTANCE_ID=`curl -s http://169.254.169.254/latest/meta-data/instance-id/`
REGION=`curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/.$//'`
aws ec2 stop-instances --instance-ids $INSTANCE_ID --region $REGION
As already stated by others, you cannot just "create" instances, they will be in "started" state.
Rather I would ask what is the exact use case here :
Sometimes, I just want the instances to be created, and they will be started later on.
Why you have to create instances now and use them later? Can't they be created exactly when they are required? Any specific requirement to keep them initialized before they are used? Or the instances take time to start?

EC2 init.d script - what's the best practice

I'm creating an init.d script that will run a couple of tasks when the instance starts up.
it will create a new volume with our code repository and mount it if it doesn't exist already.
it will tag the instance
The tasks above being complete will be crucial for our site (i.e. without the code repository mounted the site won't work). How can I make sure that the server doesn't end up being publicly visible? Should I start my init.d script with de-registering the instance from the ELB (I'm not even sure if it will be registered at that point), and then register it again when all the tasks finished successfully?
What is the best practice?
Thanks!
You should have a health check on your ELB. So your server shouldn't get in unless it reports as happy. And it shouldn't report happy if the boot script errors out.
(Also, you should look into using cloud-init. That way you can change the boot script without making a new AMI.)
I suggest you use CloudFormation instead. You can bring up a full stack of your system by representing it in a JSON format template.
For example, you can create an autoscale group that has an instances with unique tags and the instances have another volume attached (which presumably has your code)
Here's a sample JSON template attaching an EBS volume to an instance:
https://s3.amazonaws.com/cloudformation-templates-us-east-1/EC2WithEBSSample.template
And here many other JSON templates that you can use for your guidance and deploy your specific Stack and Application.
http://aws.amazon.com/cloudformation/aws-cloudformation-templates/
Of course you can accomplish the same using init.d script or using the rc.local file in your instance but I believe CloudFormation is a cleaner solution from the outside (not inside your instance)
You can also write your own script that brings up your stack from the outside by why reinvent the wheel.
Hope this helps.

Is there anyway to get the user data of a running EC2 instance via AWS SDK?

I've tried to use DescribeInstances but I've found the response result does not contain user data. Is there any way to retrieve this user data?
My usage case is I've trying to request spot instances and assign different user data to each EC2 instance for some kind of automation and then I want to tag the name of each instance according to this user data. Based on my understanding, creating a tag request requires InstanceId, which is not available at the time when I make a request to reserve a spot instance.
So I'm wondering whether there is any way to get the user data of a running instance instead of SSHing the instance...
The DescribeInstanceAttributes endpoint will provide you with user data.
http://docs.aws.amazon.com/AWSEC2/latest/APIReference/ApiReference-query-DescribeInstanceAttribute.html

What is a good way to access external data from aws

I would like to access external data from my aws ec2 instance.
In more detail: I would like to specify inside by user-data the name of a folder containing about 2M of binary data. When my aws instance starts up, I would like it to download the files in that folder and copy them to a specific location on the local disk. I only need to access the data once, at startup.
I don't want to store the data in S3 because, as I understand it, this would require storing my aws credentials on the instance itself, or passing them as userdata which is also a security risk. Please correct me if I am wrong here.
I am looking for a solution that is both secure and highly reliable.
which operating system do you run ?
you can use an elastic block storage. it's like a device you can mount at boot (without credentials) and you have permanent storage there.
You can also sync up instances using something like Gluster filesystem. See this thread on it.

Resources