Terraform and AWS spot instances - amazon-ec2

As described in https://github.com/hashicorp/terraform/issues/17429:
After 7 days the spot request gets cancelled while the instance keeps running, so when I run "terraform apply" it tries to create a new spot request. This happens with AWS provider >= 1.13.0.
I'm using AWS provider 1.32.0; does anyone know a workaround for this issue? For future installations I will use the valid_until flag, which extends the request lifetime, but what about already-installed spot instances?
Thanks
resource "aws_spot_instance_request" "cheap_worker" {
count = "${var.kube_master_spot_num}"
ami = "${data.aws_ami.nat_ami.id}"
availability_zone ="${element(slice(data.aws_availability_zones.available.names,var.kube_master_on_demand_num,var.availability_zones_num),count.index)}"
spot_price = "3"
instance_type = "${var.kube_master_type}"
subnet_id = "${element(module.aws-vpc.aws_subnet_ids,count.index + var.kube_master_on_demand_num)}" # adjusting to a case with spots & on-demand servers
vpc_security_group_ids = [ "${module.aws-vpc.cluster_sg_id}", "${module.aws-vpc.route53_sg_id}" ]
key_name = "${basename(var.local_ssh_key)}"
associate_public_ip_address = true
root_block_device = [{volume_type="gp2",volume_size="50",delete_on_termination=true}]
spot_type = "one-time"
wait_for_fulfillment = true
tags {
Name = "${var.kube_identify}-${var.kube_type}-master-${count.index}"
}
provisioner "local-exec" {
command = "sleep 120"
}
connection {
type = "ssh"
user = "ubuntu"
private_key = "${file(var.local_ssh_key)}"
}
provisioner "remote-exec" {
inline = [
"sudo apt-get update"
]
}
}
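For reference, a minimal sketch of the valid_until approach mentioned above: valid_until takes an RFC 3339 timestamp after which the spot request expires, and the date below is only a placeholder, not a recommended lifetime.

resource "aws_spot_instance_request" "cheap_worker" {
  # ... same arguments as above ...

  # Placeholder RFC 3339 timestamp; pick a lifetime that suits your cluster.
  valid_until = "2019-12-31T23:59:59Z"
}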

Related

Terraform starting EC2 sometimes stuck on "Still creating" until timeout

I am running Terraform through Jenkins; it starts up an EC2 instance and then runs a shell script on it via user_data. I run this job 23 times in parallel, and for some reason each time only a few of the instances (anywhere from 1 to 8, and always different indices) hang on "aws_instance.genomic-etl-ec2: Still creating..." until the connection times out after approximately an hour and throws a RequestExpired error, with no further details on why. The other instances start fine within around 2-3 minutes each.
My resource:
data "template_file" "my-user_data" {
template = file("scripts/my_script.sh")
}
data "template_cloudinit_config" "my-user-data" {
gzip = true
base64_encode = true
# user_data
part {
content_type = "text/x-shellscript"
content = data.template_file.my-user_data.rendered
}
}
resource "aws_instance" "genomic-etl-ec2" {
ami = var.ami-id
instance_type = "m5.12xlarge"
associate_public_ip_address = true
subnet_id = var.my-subnet-us-east-id
iam_instance_profile = "my-deployment-profile"
user_data = data.template_cloudinit_config.my-user-data.rendered
vpc_security_group_ids = [
aws_security_group.my-sg1.id,
aws_security_group.my-sg2.id
]
root_block_device {
delete_on_termination = true
encrypted = true
volume_size = 1000
}
provisioner "local-exec" {
command = "sleep 40"
}
tags = {
Owner = "Me"
Environment = "development"
Name = "My EC2 - ${id}"
automaticPatches = "1"
}
}
Sometimes AWS instances take a long time to become fully available, and it's not uncommon for that to exceed Terraform's default timeout, causing Terraform to fail.
Per the official documentation for the Terraform aws_instance resource, the create timeout defaults to 10 minutes. If a particular instance type takes longer than 10 minutes to become available, you need to increase the create timeout setting:
resource "aws_instance" "genomic-etl-ec2" {
# ...
timeouts {
create = "20m"
}
}

How to upload a local file to the EC2 instance with the module terraform-aws-modules/ec2-instance/aws?

How to upload a local file to the EC2 instance with the module terraform-aws-modules/ec2-instance/aws?
I placed the provisioner inside module "ec2". It does not work.
I placed the provisioner outside of module "ec2". It does not work either.
I got the error: "Blocks of type "provisioner" are not expected here".
"provisioner" is inside module "ec2". It does not work.
module "ec2" {
source = "terraform-aws-modules/ec2-instance/aws"
version = "4.1.4"
name = var.ec2_name
ami = var.ami
instance_type = var.instance_type
availability_zone = var.availability_zone
subnet_id = data.terraform_remote_state.vpc.outputs.public_subnets[0]
vpc_security_group_ids = [aws_security_group.sg_WebServerSG.id]
associate_public_ip_address = true
key_name = var.key_name
provisioner "file" {
source = "./foo.txt"
destination = "/home/ec2-user/foo.txt"
connection {
type = "ssh"
user = "ec2-user"
private_key = "${file("./keys.pem")}"
host = module.ec2.public_dns
}
}
}
"provisioner" is outsite of the module "ec2". It does not work.
module "ec2" {
source = "terraform-aws-modules/ec2-instance/aws"
version = "4.1.4"
name = var.ec2_name
ami = var.ami
instance_type = var.instance_type
availability_zone = var.availability_zone
subnet_id = data.terraform_remote_state.vpc.outputs.public_subnets[0]
vpc_security_group_ids = [aws_security_group.sg_WebServerSG.id]
associate_public_ip_address = true
key_name = var.key_name
}
provisioner "file" {
source = "./foo.txt"
destination = "/home/ec2-user/foo.txt"
connection {
type = "ssh"
user = "ec2-user"
private_key = "${file("./keys.pem")}"
host = module.ec2.public_dns
}
}
You can use a null_resource to make it work!
resource "null_resource" "this" {
provisioner "file" {
source = "./foo.txt"
destination = "/home/ec2-user/foo.txt"
connection {
type = "ssh"
user = "ec2-user"
private_key = "${file("./keys.pem")}"
host = module.ec2.public_dns
}
}
You can provision files on an EC2 instance with the YAML cloud-init syntax which is passed to the EC2 instance as user-data. Here is an example of passing cloud-init config to EC2.
cloud-init.yaml file:
#cloud-config
# vim: syntax=yaml
#
# This is the configuration syntax that the write_files module
# will know how to understand. Encoding can be given b64 or gzip or (gz+b64).
# The content will be decoded accordingly and then written to the path that is
# provided.
#
# Note: Content strings here are truncated for example purposes.
write_files:
  - content: |
      # Your TXT file content...
      # goes here
    path: /home/ec2-user/foo.txt
    owner: ec2-user:ec2-user
    permissions: '0644'
Terraform file:
module "ec2" {
source = "terraform-aws-modules/ec2-instance/aws"
version = "4.1.4"
name = var.ec2_name
ami = var.ami
instance_type = var.instance_type
availability_zone = var.availability_zone
subnet_id = data.terraform_remote_state.vpc.outputs.public_subnets[0]
vpc_security_group_ids = [aws_security_group.sg_WebServerSG.id]
associate_public_ip_address = true
key_name = var.key_name
user_data = file("./cloud-init.yaml")
}
The benefits of this approach over the approach in the accepted answer are:
This method creates the file immediately at instance creation, instead of having to wait for the instance to come up first. The null-provisioner/SSH connection method has to wait for the EC2 instance to become available, and the timing of that could cause your Terraform workflow to become flaky.
This method doesn't require the EC2 instance to be reachable from your local computer that is running Terraform. You could be deploying the EC2 instance to a private subnet behind a load balancer, which would prevent the null-provisioner/SSH connect method from working.
This doesn't require you to have the SSH key for the EC2 instance available on your local computer. You might want to only allow AWS SSM connect to your EC2 instance, to keep it more secure than allowing SSH directly from the Internet, and that would prevent the null-provisioner/SSH connect method from working. Further, storing or referencing an SSH private key in your Terraform state adds a risk factor to your overall security profile.
This doesn't require the use of a null_resource provisioner, about which the Terraform documentation states:
Important: Use provisioners as a last resort. There are better alternatives for most situations. Refer to Declaring Provisioners for more details.

Provisioning Windows VM including File Provisioner for AWS using Terraform results in Timeout

I'm aware that several posts similar to this one already exist - I've gone through them and adapted my Terraform configuration file, but it makes no difference.
Therefore, I'd like to publish my configuration file and my use case: I'd like to provision a (Windows) virtual machine on AWS using Terraform. It works without the file provisioning part; with it included, provisioning results in a timeout.
This includes adaptations from previous posts:
SSH connection restriction
SSH isnt working in Windows with Terraform provisioner connection type
Usage of a Security group
Terraform File provisioner can't connect ec2 over ssh. timeout - last error: dial tcp 92.242.xxx.xx:22: i/o timeout
I also get a timeout when using "winrm" instead of "ssh".
I'd be happy if you could provide any hint for the following config file:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  access_key = "<my access key>"
  secret_key = "<my secret key>"
  region     = "eu-central-1"
}

resource "aws_instance" "webserver" {
  ami             = "ami-07dfec7a6d529b77a"
  instance_type   = "t2.micro"
  security_groups = [aws_security_group.sgwebserver.name]
  key_name        = aws_key_pair.pubkey.key_name

  tags = {
    "Name" = "WebServer-Win"
  }
}

resource "null_resource" "deployBundle" {
  connection {
    type        = "ssh"
    user        = "Administrator"
    private_key = "${file("C:/Users/<my user name>/aws_keypair/aws_instance.pem")}"
    host        = aws_instance.webserver.public_ip
  }

  provisioner "file" {
    source      = "files/test.txt"
    destination = "C:/test.txt"
  }

  depends_on = [aws_instance.webserver]
}

resource "aws_security_group" "sgwebserver" {
  name        = "sgwebserver"
  description = "Allow ssh inbound traffic"

  ingress {
    from_port   = 0
    to_port     = 6556
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "sgwebserver"
  }
}

resource "aws_key_pair" "pubkey" {
  key_name   = "aws-cloud"
  public_key = file("key/aws_instance.pub")
}

resource "aws_eip" "elasticip" {
  instance = aws_instance.webserver.id
}

output "eip" {
  value = aws_eip.elasticip.public_ip
}

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "my-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["eu-central-1a", "eu-central-1b", "eu-central-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true
  enable_vpn_gateway = true

  tags = {
    Terraform   = "true"
    Environment = "dev"
  }
}
Thanks a lot in advance!
Windows EC2 instances don't support SSH out of the box; they support RDP. You would have to install SSH server software on the instance before you could SSH into it.
I suggest doing something like placing the file in S3, and using a user data script to trigger the Windows EC2 instance to download the file on startup.
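A minimal sketch of that approach, assuming the file already sits in S3 and the instance has an instance profile (named ec2-s3-read-profile here purely for illustration) that allows s3:GetObject on the bucket; the <powershell> user data block runs at first boot, and Read-S3Object is part of the AWS Tools for PowerShell preinstalled on Amazon Windows AMIs:

resource "aws_instance" "webserver" {
  ami                  = "ami-07dfec7a6d529b77a"
  instance_type        = "t2.micro"
  key_name             = aws_key_pair.pubkey.key_name
  security_groups      = [aws_security_group.sgwebserver.name]
  iam_instance_profile = "ec2-s3-read-profile" # hypothetical profile granting s3:GetObject

  # Download the file from S3 at first boot instead of pushing it over SSH/WinRM.
  # Bucket name and key are placeholders.
  user_data = <<-EOF
    <powershell>
    Read-S3Object -BucketName "my-deploy-bucket" -Key "test.txt" -File "C:\test.txt"
    </powershell>
  EOF

  tags = {
    "Name" = "WebServer-Win"
  }
}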

Decrypting Windows Password in terraform

I'm trying to set up a Terraform script to deploy a Windows server. When running terraform apply I get the error message below:
Error: Invalid reference
on main.tf line 44, in resource "aws_instance" "server":
44: password = "${rsadecrypt(aws_instance.server[0].password_data, file(KEY_PATH))}"
A reference to a resource type must be followed by at least one attribute
access, specifying the resource name.
AFAIK the resource is "aws_instance", the name is "server[0]", and the attribute is "password_data". I know I'm missing something but don't know what; any assistance would be appreciated.
The full resource block is below in case there is something I'm missing in there.
Thanks
resource "aws_instance" "server" {
ami = var.AMIS[var.AWS_REGION]
instance_type = var.AWS_INSTANCE
vpc_security_group_ids = [module.networking.security_group_id_out]
subnet_id = module.networking.subnet_id_out
## Use this count key to determine how many servers you want to create.
count = 1
key_name = var.KEY_NAME
tags = {
# Name = "Server-Cloud"
Name = "Server-${count.index}"
}
root_block_device {
volume_size = var.VOLUME_SIZE
volume_type = var.VOLUME_TYPE
delete_on_termination = true
}
get_password_data = true
provisioner "remote-exec" {
connection {
host = coalesce(self.public_ip, self.private_ip)
type = "winrm"
## Need to provide your own .pem key that can be created in AWS or on your machine for each provisioned EC2.
password = ${rsadecrypt(aws_instance.server[0].password_data, file(KEY_PATH))}
}
inline = [
"powershell -ExecutionPolicy Unrestricted C:\\Users\\Administrator\\Desktop\\installserver.ps1 -Schedule",
]
}
provisioner "local-exec" {
command = "echo ${self.public_ip} >> ../public_ips.txt"
}
}
Use password = "${rsadecrypt(self.password_data, file("/root/.ssh/id_rsa"))}"
without user = "admin" as below :
resource "aws_instance" "windows_server" {
get_password_data = "true"
connection {
host = "${self.public_ip}"
type = "winrm"
https = false
password = "${rsadecrypt(self.password_data, file("/root/.ssh/id_rsa"))}"
agent = false
insecure = "true"
}
}
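For completeness, here is a minimal sketch of how that looks combined with the question's resource, assuming KEY_PATH is declared as a Terraform variable (var.KEY_PATH); the key point is using self.password_data inside the provisioner's connection block so the resource never references itself by name:

resource "aws_instance" "server" {
  ami               = var.AMIS[var.AWS_REGION]
  instance_type     = var.AWS_INSTANCE
  key_name          = var.KEY_NAME
  count             = 1
  get_password_data = true

  provisioner "remote-exec" {
    connection {
      host = coalesce(self.public_ip, self.private_ip)
      type = "winrm"
      # self.password_data avoids the "Invalid reference" error;
      # var.KEY_PATH is assumed to point at the matching private key file.
      password = rsadecrypt(self.password_data, file(var.KEY_PATH))
    }

    inline = [
      "powershell -ExecutionPolicy Unrestricted C:\\Users\\Administrator\\Desktop\\installserver.ps1 -Schedule",
    ]
  }
}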

Terraform fails remote-exec (aws/ec2)

When trying to execute a shell script through the "remote-exec" provisioner in Terraform, the connection is not established.
I'm using an AMI for ubuntu-xenial-16.04, so the user is ubuntu.
This is the latest code that I use to execute the shell script:
resource "aws_instance" "secondary_zone" {
count = 1
instance_type = "${var.ec2_instance_type}"
ami = "${data.aws_ami.latest-ubuntu.id}"
key_name = "${aws_key_pair.deployer.key_name}"
subnet_id = "${aws_subnet.secondary.id}"
vpc_security_group_ids = ["${aws_security_group.server.id}"]
associate_public_ip_address = true
provisioner "remote-exec" {
inline = ["${template_file.script.rendered}"]
}
connection {
type = "ssh"
user = "ubuntu"
private_key = "${file("~/.ssh/id_rsa")}"
}
}
This is what I get in the console:
aws_instance.secondary_zone (remote-exec): Connecting to remote host via SSH...
aws_instance.secondary_zone (remote-exec): Host: x.x.x.x
aws_instance.secondary_zone (remote-exec): User: ubuntu
aws_instance.secondary_zone (remote-exec): Password: false
aws_instance.secondary_zone (remote-exec): Private key: true
aws_instance.secondary_zone (remote-exec): SSH Agent: false
aws_instance.secondary_zone (remote-exec): Checking Host Key: false
Thank you for your help...
I had the same issue. In your connection block try specifying the host.
connection {
  type        = "ssh"
  user        = "ubuntu"
  private_key = "${file("~/.ssh/id_rsa")}"
  host        = self.public_ip
}
I also had to create a route and gateway and associate them with my VPC. I'm still learning Terraform, but this worked for me.
resource "aws_internet_gateway" "test-env-gw" {
vpc_id = aws_vpc.test-env.id
}
resource "aws_route_table" "route-table-test-env" {
vpc_id = aws_vpc.test-env.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.test-env-gw.id
}
}
resource "aws_route_table_association" "subnet-association" {
subnet_id = aws_subnet.us-east-2a-public.id
route_table_id = aws_route_table.route-table-test-env.id
}
As I mentioned, it was a connection problem in my case.
In addition, template_file was deprecated, so I changed the code to:
resource "aws_instance" "secondary_zone" {
instance_type = "${var.ec2_instance_type}"
ami = "${data.aws_ami.latest-ubuntu.id}"
key_name = "${aws_key_pair.deployer.key_name}"
subnet_id = "${aws_subnet.secondary.id}"
vpc_security_group_ids = ["${aws_security_group.server.id}"]
associate_public_ip_address = true
connection {
type = "ssh"
user = "ubuntu"
private_key = "${file("~/.ssh/id_rsa")}"
timeout = "2m"
}
provisioner "file" {
source = "/server/script.sh"
destination = "/tmp/script.sh"
}
provisioner "remote-exec" {
inline = [
"chmod +x /tmp/script.sh",
"/tmp/script.sh args",
]
}
}
Also, I learned that script.sh has to use LF (Unix) line endings.
If you're just trying to run some scripts to provision whichever EC2 nodes you create with Terraform, I would try setting the user_data parameter to reference your script. The user data script runs automatically when the node initializes.
This will ensure that there are no lifecycle related issues with your deployment (such as EC2 node being created and the host not being available for the remote exec to succeed) and an overall cleaner experience.
An example of this approach:
resource "aws_instance" "secondary_zone" {
count = 1
instance_type = "${var.ec2_instance_type}"
ami = "${data.aws_ami.latest-ubuntu.id}"
key_name = "${aws_key_pair.deployer.key_name}"
subnet_id = "${aws_subnet.secondary.id}"
vpc_security_group_ids = ["${aws_security_group.server.id}"]
associate_public_ip_address = true
user_data = "${template_file.script.rendered}"
}
Hope this helps!
Further reading:
TF docs
Userdata examples
