Conditional expressions with Terraform deploying AWS resources - bash

My current Terraform setup consists of 3 template files. Each template file is linked to a launch configuration resource, which is then used to launch instances via auto scaling events. Each template file contains an AWS CLI command that attaches an existing EBS volume to the new instance being launched by autoscaling. I am having trouble writing a conditional expression that passes the right volume ID into this AWS CLI command. Since I have 3 template files and 3 EBS volumes to attach (one per instance, each in its own autoscaling group), and a conditional expression can only choose between two outcomes, I don't see how to cover the third volume. Any advice on how I can achieve this would be helpful.
Template_file
data "template_file" "ML_10_user_data" {
count = "${(var.enable ? 1 : 0) * var.number_of_zones}" // 3 templates
template = "${file("userdata.sh")}
vars {
ebs_volume = "${count.index == 0 ? vol-xxxxxxxxxxxxxxxxx : vol-xxxxxxxxxxxxxxxxx}" // how to include 3rd EBS volume?
}
}
userdata.sh
#!/bin/bash
# ebs_volume is supplied by the template_file vars block
EBS_VOLUME=${ebs_volume}
aws ec2 attach-volume --volume-id $EBS_VOLUME --instance-id `curl http://169.254.169.254/latest/meta-data/instance-id` --device /dev/sdf
Any advice on how I can fulfill what I am trying to accomplish would be appreciated.

The best way would be to put it in a list:
variable "volumes" {
default = ["vol-1111","vol-2222","vol-333"]
}
data "template_file" "user_data" {
count = "${(var.enable ? 1 : 0) * var.number_of_zones}"
template = "${file("userdata.sh")}"
vars {
ebs_volume = "${var.volumes[count.index]}"
}
}
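To wire each rendered template to its own launch configuration, the same count/index pattern carries through. This is only a rough sketch in pre-0.12 syntax to match the question; the resource name and the var.ami_id / var.instance_type variables are assumptions, not taken from the question:

resource "aws_launch_configuration" "ml" {
  count         = "${(var.enable ? 1 : 0) * var.number_of_zones}"
  image_id      = "${var.ami_id}"        # assumed variable
  instance_type = "${var.instance_type}" # assumed variable
  user_data     = "${element(data.template_file.user_data.*.rendered, count.index)}"

  lifecycle {
    create_before_destroy = true
  }
}

Each autoscaling group then references the launch configuration with the matching index.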
However, if these instances are going to be in an ASG, this is not a very good design. Instances in an ASG should be identical, interchangeable, and disposable. They can be terminated at any time by AWS or by scaling activities, and you should treat them as a group, not as individual entities.


AWS-CDK: Cross account Resource Access and Resource reference

I have a secret key-value pair in Secrets Manager in Account-1 in us-east-1. This secret is encrypted using a Customer managed KMS key - let's call it KMS-Account-1. All this has been created via console.
Now we turn to CDK. We have a cdk.pipelines.CodePipeline which deploys a Lambda to multiple stages/environments: first to { Account-2, us-east-1 }, then to { Account-3, eu-west-1 }, and so on. This has been done.
The Lambda code in all the stages/environments above now needs to be changed to use the secret key-value pair in Account-1's us-east-1 Secrets Manager, fetching it via the secretsmanager client. That code should probably look like this (Python):
import json

import boto3

client = boto3.session.Session().client(
    service_name='secretsmanager',
    region_name='us-east-1'
)
resp = client.get_secret_value(
    SecretId='arn:aws:secretsmanager:us-east-1:<ACCOUNT-1>:secret:name/of/the/secret'
)
secret = json.loads(resp['SecretString'])
All Lambdas in the various accounts and regions (i.e. environments) will have the exact same code as above, since the secret needs to be fetched from Account-1 in us-east-1.
First of all, I hope this is conceptually possible. Is that right?
Next, how do I change the CDK code to facilitate this? How will the deployment in the pipeline get permission to import this custom KMS key and Secrets Manager secret, and apply the correct permissions for cross-account access by the Lambdas that the CDK pipeline creates?
Can someone please give some pointers?
This is a bit tricky because CloudFormation, and hence CDK, doesn't allow cross-account/cross-stage references; CloudFormation exports don't work across accounts, as far as I understand. All these patterns of "centralised" resources fall into that category, i.e. a resource in one account (or one stage in CDK) referenced by other stages.
If the resource is created outside the context of CDK (for example via the console), then you might as well hardcode the names/ARNs/etc. throughout the CDK code where they are used, and that should be sufficient.
For resources that can hold resource-based policies it's simpler, as you can attach the cross-account access permissions to them directly, again offline via the console, since you are maintaining them manually anyway. Each time you add a stage (account) to your pipeline, you will need to go to the resource and add the cross-account permissions manually.
For resources that don't have resource-based policies, like SSM parameters for example, things are a bit more roundabout: you need to create a role that can be assumed cross-account and then access the resource through it. In that case you also have to maintain the IAM role separately and manually update its trust policy for the other accounts as you add stages to your CDK pipeline. Then, as usual, hardcode the role ARN in your CDK code, assume it in some custom-resource Lambda, and use it.
It gets more interesting if the creation is also done in the CDK code itself (i.e. managed by CloudFormation, not done separately via the console/AWS CLI). In this case you often won't "know" the exact ARNs, because the physical ID is generated by CloudFormation and usually forms part of the ARN. Even influencing the physical ID yourself (for example by hardcoding a bucket name) might not solve it in all cases; KMS ARNs and Secrets Manager ARNs, for instance, append unique IDs or hashes to the end of the ARN.
Instead of trying to work all that out, it is best left untouched: let CloudFormation generate whatever random name/ARN it chooses. To reference these constructs/ARNs elsewhere, put them into SSM Parameters in the source/central account. SSM doesn't have resource-based policies that I know of, so additionally create a role in CDK that trusts the other accounts in your CDK code. Once done, there is no more maintenance: each time you add new environments/accounts to CDK (assuming it's a CDK pipeline here), the loop you write will automatically add the new account to the trust relationship.
Now all you need to do is distribute this role ARN and the SSM parameter names to the other stages. Choose an explicit role name and explicit SSM parameter names; constructing a role ARN manually from a known role name is straightforward. Distribute those around your CDK code to the other stages as compile-time strings instead of references. In the target stages, create custom resources (AwsCustomResource) backed by an AwsSdkCall that simply assumes this role ARN and makes the SDK call to retrieve the SSM parameter values. These values can be anything you couldn't easily guess, like your KMS ARNs or the full Secrets Manager ARNs. Now simply use these.
It's a roundabout way to do a simple thing, but so far that is all I could do to get this to work.
# You need to maintain this list no matter what you do - so it's nothing extra
all_other_accounts = [ <list of accounts that this cdk deploys to> ]
account_principals = [iam.AccountPrincipal(a) for a in all_other_accounts]

role = iam.Role(
    self, 'CentralResourceAccessRole',
    assumed_by = iam.CompositePrincipal(*account_principals),  # auto-updated as you change the list above
    role_name = some_explicit_name,
    ...
)
role_arn = f'arn:aws:iam::<account-of-this-stack>:role/{some_explicit_name}'

kms0 = kms.Key(...)
kms0.grant_decrypt(role)
# KMS also needs an explicit resource policy even if the role policy allows access to it
kms0.add_to_resource_policy(iam.PolicyStatement(principals = [iam.ArnPrincipal(role_arn)], actions = ...))

kms1 = kms.Key(...)
kms1.grant_decrypt(role)
kms1.add_to_resource_policy(... same as above ...)

secrets0 = secretsmanager.Secret(...)  # maybe this is based off kms0
secrets0.grant_read(role)
secrets1 = secretsmanager.Secret(...)  # maybe this is based off kms1
secrets1.grant_read(role)

# You can turn all this into a loop, of course.
ssm0 = ssm.StringParameter(self, '...', parameter_name = 'kms0_arn', string_value = kms0.key_arn, ...)
ssm0.grant_read(role)
ssm1 = ssm.StringParameter(self, '...', parameter_name = 'kms1_arn', string_value = kms1.key_arn, ...)
ssm1.grant_read(role)
ssm2 = ssm.StringParameter(self, '...', parameter_name = 'secrets0_arn', string_value = secrets0.secret_full_arn, ...)
ssm2.grant_read(role)
...

# Now simply pass around the role ARN and the SSM parameter names
for env in environments:
    MyApplicationStage(self, <...>, ..., role_arn = role_arn, params = [ 'kms0_arn', 'kms1_arn', ... ], ...)
And then in the target stage(s):
for param in params:
    fn = AwsSdkCall(service = 'SSM', action = 'getParameter',
                    parameters = { 'Name': param }, assumed_role_arn = role_arn, ...)
    acr = AwsCustomResource(..., on_create = fn, on_update = fn, ...)
    collect[param] = acr.get_response_field('Parameter.Value')
Now do whatever you want with the collected values, including supplying them as environment variables to your main service Lambda.
Remember they will all be tokens, resolved only at deploy time, but that is true of referencing any resource, whether or not it goes through a custom resource, so it shouldn't matter.
That's a generic pattern which should work for any case.
(GitHub link where this question was asked and I had answered it there too)

Error creating ElasticSearch domain using Terraform - Exactly one subnet needed

I may have come across some conflicting docs today.
When creating an Elasticsearch domain with vpc_options, the official HashiCorp Terraform docs (https://www.terraform.io/docs/providers/aws/r/elasticsearch_domain.html) say that subnet_ids is a list, and they even specify 2 subnets in their example. However, when I specify 2 subnets I get an error (I tried 2 different ways of specifying the subnets list):
vpc_options {
  subnet_ids = "${var.private_subnet_ids}"
}

OR

vpc_options {
  subnet_ids = [
    "${var.private_subnet_ids[0]}",
    "${var.private_subnet_ids[1]}"
  ]
}
Both of them give me the same error -
Error: Error creating ElasticSearch domain: ValidationException: You must specify exactly one subnet.
status code: 400, request id: 98b49b34-2da8-11ea-8114-e9488cc7cb63
on modules/es/main.tf line 51, in resource "aws_elasticsearch_domain" "es":
51: resource "aws_elasticsearch_domain" "es" {
If I specify a single subnet, it works fine.
subnet_ids = ["${var.private_subnet_ids[0]}"]
I do however want to be able to specify both of my private subnets for the ES cluster.
Is there a way to do that? I noticed a couple of issues on GitHub for this, but the resolution was the same as what is in the Terraform docs, and that does not work for me. I am using v0.12.17 in case it matters.
variable private_subnet_ids is a list
variable "private_subnet_ids" {
type = "list"
description = "The list of private subnets to place the instances in"
}
The underlying AWS documentation (https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-vpc.html) explains the structure behind the Terraform behaviour: a domain that lives in a single Availability Zone can be attached to exactly one subnet. Only if you enable multi-AZ (zone awareness) on the domain can you supply a second subnet, which must then be in the other AZ that the cluster spans.
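In Terraform that translates to enabling zone awareness on the domain and supplying one subnet per Availability Zone. A minimal sketch against the aws_elasticsearch_domain resource; the instance type and count are placeholders, and the rest of the domain arguments are assumed to stay as in your module:

resource "aws_elasticsearch_domain" "es" {
  # ... domain_name, elasticsearch_version, etc. as before ...

  cluster_config {
    instance_type          = "r5.large.elasticsearch" # placeholder
    instance_count         = 2                        # typically a multiple of the AZ count
    zone_awareness_enabled = true

    zone_awareness_config {
      availability_zone_count = 2
    }
  }

  vpc_options {
    subnet_ids = [
      "${var.private_subnet_ids[0]}",
      "${var.private_subnet_ids[1]}"
    ]
  }
}

With zone awareness enabled, the number of entries in subnet_ids has to match availability_zone_count, and the "exactly one subnet" validation error goes away.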

Script to AWS Lambda?

I have this bash script which lists all active EC2 instances across regions:
for region in `aws ec2 describe-regions --output text | cut -f3`; do
  echo -e "\nListing Instances in region:'$region'..."
  aws ec2 describe-instances --region $region
done
I'd like to port this to a Lambda function on AWS. What would be the best approach today? Do I have to use a wrapper or similar? Node? I googled and found what looked like mostly workarounds, but they were a couple of years old. I would appreciate an up-to-date indication.
Two ways to do it:
Using custom runtimes and layers: https://github.com/gkrizek/bash-lambda-layer
'Exec'ing from another runtime: https://github.com/alestic/lambdash
You should write it in a language that has an AWS SDK, such as Python.
You should also think about what the Lambda function should do with the output, since at the moment it just retrieves information but doesn't do anything with it.
Here's a sample AWS Lambda function:
import boto3

def lambda_handler(event, context):
    instance_ids = []

    # Get a list of regions
    ec2_client = boto3.client('ec2')
    response = ec2_client.describe_regions()

    # For each region
    for region in response['Regions']:
        # Get a list of instances
        ec2_resource = boto3.resource('ec2', region_name=region['RegionName'])
        for instance in ec2_resource.instances.all():
            instance_ids.append(instance.id)

    # Return the list of instance_ids
    return instance_ids
Please note that it takes quite a bit of time to call all the regions sequentially. The above can take 15-20 seconds.

terraform destroy doesn't delete the ec2 instance created using input parameters for variables

I tried launching an EC2 instance using input parameters for the variables in the terraform apply command. This creates the instance successfully. However, when I try to delete the instance using terraform destroy, it runs but nothing gets deleted.
I have a region variable with a default value. When I pass a different region in this variable using input parameters, the instance launches just fine in the provided region, but I am not able to terminate it using terraform destroy.
main.tf
variable "region" {
default = "us-west-1"
}
variable "ami" {
type = "map"
default = {
us-east-2 = "ami-02e680c4540db351e"
us-west-1 = "ami-011b6930a81cd6aaf"
}
}
provider "aws" {
region = "${var.region}"
}
resource "aws_instance" "web" {
ami = "${lookup(var.ami,var.region)}"
instance_type = "t2.micro"
tags {
Name = "naxi"
}
}
Terraform apply:
terraform apply -var region=us-east-2
Output of terraform destroy:
aws_instance.web: Refreshing state... (ID: i-05ca0514f61dcaf16)
Do you really want to destroy all resources?
Terraform will destroy all your managed infrastructure, as shown above.
There is no undo. Only 'yes' will be accepted to confirm.
Enter a value: yes
Destroy complete! Resources: 0 destroyed.
Though it's able to look up the instance ID in the correct region, my guess is that it is trying to terminate the instance from the default region and not from the one I supplied as a parameter.
Is there a way I can supply a parameter -var region=something with terraform destroy?
Destroy works as expected if I use the default values and no input parameters.
EDIT---
As soon as I run terraform destroy -var-file=variables.tfvars, all the instance-related information is removed from the terraform.tfstate file and the previous contents of that file are saved as a backup to terraform.tfstate.backup. But the instance is still not deleted.
I think this is your main problem:
You ran apply with your "aws" provider configured one way (region supplied via a variable), but then you ran destroy with that provider configured differently (you let the "region" variable fall back to its default instead of specifying it).
As a result, terraform destroy looked in the wrong place (the wrong AWS region) for your created resources.
Since terraform destroy was looking in the wrong place, it found nothing there.
Therefore terraform destroy concluded that it had nothing to destroy and just updated its locally stored state to reflect the absence of the resources.
Try these steps instead:
terraform apply -var 'region=us-east-2'
terraform destroy -var 'region=us-east-2'
This works for me, Terraform v0.12.2 + provider.aws v2.16.0.
I am guessing slightly here, but the point seems to be that you, the Terraform user, are responsible for making sure you destroy with the exact same provider definitions you applied with.
And if you're using any variables to help define your providers, this is something you need to be especially mindful of, since it becomes easy to accidentally change a provider definition.
As a side note, I ran into a similar confusion myself. It seems to me that HashiCorp's Getting Started guide, in its current state, could do a better job of warning about this: it walks newbies through a very similar setup to yours and currently appears to say nothing about how to destroy properly, or about the potential pitfalls.
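One way to make that mistake harder (a small sketch, not from the answer above) is to pin the region in a terraform.tfvars file, which Terraform loads automatically for apply, plan, and destroy, so every command resolves var.region to the same value:

# terraform.tfvars - picked up automatically by terraform apply/plan/destroy
region = "us-east-2"

With the value pinned there, you no longer depend on remembering to pass -var region=... to every command.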
Perhaps you have multiple providers set. Try aliasing your provider and passing that into the resource.
provider "aws" {
region = var.region
alias = "mine"
}
resource "aws_instance" "web" {
provider = aws.mine
}

Ruby calling AWS ELB functions

I'm writing some Ruby scripts to wrap AWS ELB command line calls, mostly so that I can act on several ELB instances simultaneously. One task is to use the elb-describe-instance-health call to see which instance IDs are attached to an ELB.
I want to match each instance ID to a nickname we have set up for those instances, so that I can see at a glance which machines are connected to the ELB without having to look up the instance names.
So I am issuing:
cmd = "elb-describe-instance-health #{elbName}"
value = `#{cmd}`
This passes the ELB name into the call and returns output such as:
INSTANCE_ID i-jfjtktykg InService N/A N/A
INSTANCE_ID i-ujelforos InService N/A N/A
One line appears for each instance in the ELB, with two spaces between each field.
What I need is the second field, which is the actual instance ID. Basically I'm trying to take each line returned, turn it into an array, and get the 2nd field, which I can then use to look up our server nickname.
Not sure if this is the right approach, but any suggestions on how to get this done are very welcome.
The newly released aws-sdk gem supports Elastic Load Balancing (AWS::ELB). If you want to get a list of instance ids attached to your load balancer you can do the following:
AWS.config(:access_key_id => '...', :secret_access_key => '...')
elb = AWS::ELB.new
instance_ids = elb.load_balancers['LOAD_BALANCER_NAME'].instances.collect(&:id)
You could also use EC2 tags to store your instance nicknames.
ec2 = AWS::EC2.new
ec2.instances['INSTANCE_ID'].tags['nickname'] = 'NICKNAME'
Assuming your instances are tagged with their nicknames, you could collect them like so:
elb = AWS::ELB.new
elb.load_balancers['LOAD_BALANCER_NAME'].instances.collect{|i| i.tags['nickname'] }
A simple way to extract the second column would be something like this:
ids = value.split("\n").collect { |line| line.split(/\s+/)[1] }
This will leave the second column's values in the array ids. All it does is break the value into lines, break each line into whitespace-delimited columns, and then extract the second column.
There's probably no need to try to be too clever for something like this, a simple and straight forward solution should be sufficient.
References:
collect
split
