Error creating ElasticSearch domain: ValidationException: Authentication error - elasticsearch

I have been getting this error lately while creating an ES domain using Terraform. Nothing has changed in the way I define the ES domain. I did, however, start using SSL (an AWS ACM cert) on the ALB layer, but that should not have affected this. Any ideas what it might be complaining about?
resource "aws_elasticsearch_domain" "es" {
domain_name = "${var.es_domain}"
elasticsearch_version = "6.3"
cluster_config {
instance_type = "r4.large.elasticsearch"
instance_count = 2
zone_awareness_enabled = true
}
vpc_options {
subnet_ids = "${var.private_subnet_ids}"
security_group_ids = [
"${aws_security_group.es_sg.id}"
]
}
ebs_options {
ebs_enabled = true
volume_size = 10
}
access_policies = <<CONFIG
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "es:*",
"Principal": "*",
"Effect": "Allow",
"Resource": "arn:aws:es:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:domain/${var.es_domain}/*"
}
]
}
CONFIG
snapshot_options {
automated_snapshot_start_hour = 23
}
tags = {
Domain = "${var.es_domain}"
}
depends_on = [
"aws_iam_service_linked_role.es",
]
}
resource "aws_iam_service_linked_role" "es" {
aws_service_name = "es.amazonaws.com"
}
EDIT: Oddly enough, when I removed the ACM cert and moved my ALB listener back to HTTP (port 80), the ES domain was provisioned.
Not sure what to make of this, but clearly the ACM cert is interfering with the ES domain creation, or I am doing something wrong with the ACM creation. Here is how I create and use it:
resource "aws_acm_certificate" "ssl_cert" {
domain_name = "api.xxxx.io"
validation_method = "DNS"
tags = {
Environment = "development"
}
lifecycle {
create_before_destroy = true
}
}
resource "aws_alb_listener" "alb_listener" {
load_balancer_arn = "${aws_alb.alb.id}"
port = "443"
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-2016-08"
certificate_arn = "${aws_acm_certificate.ssl_cert.arn}"
default_action {
target_group_arn = "${aws_alb_target_group.default.id}"
type = "forward"
}
}
The cert is validated and issued by AWS pretty fast, as far as I can see in the console. And as seen, it has nothing to do with the ES domain per se.

It sometimes happens that Terraform starts creating the ES domain before the service-linked role is actually available, even when using depends_on.
Maybe you can try using a local-exec provisioner to wait:
resource "aws_iam_service_linked_role" "es" {
aws_service_name = "es.amazonaws.com"
provisioner "local-exec" {
command = "sleep 10"
}
}
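Alternatively, the hashicorp/time provider offers a more declarative pause. This is a sketch, not part of the original answer; it assumes Terraform 0.12+ syntax, the time provider installed, and an arbitrary 30-second duration:

# Sketch (assumed setup): declarative wait via the hashicorp/time provider.
resource "time_sleep" "wait_for_es_role" {
  depends_on      = [aws_iam_service_linked_role.es]
  create_duration = "30s" # arbitrary; tune to your account's propagation delay
}

# The ES domain would then use: depends_on = [time_sleep.wait_for_es_role]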

The resource below is enough for the service-linked role creation; also include the role in the domain's depends_on:
resource "aws_iam_service_linked_role" "es" {
aws_service_name = "es.amazonaws.com"
}


How to solve "The IAM role configured on the integration or API Gateway doesn't have permissions to call the integration"

I have a Lambda function and an apigatewayv2. I am creating everything via Terraform as below.
resource "aws_lambda_function" "prod_options" {
description = "Production Lambda"
environment {
variables = var.prod_env
}
function_name = "prod-func"
handler = "index.handler"
layers = [
aws_lambda_layer_version.node_modules_prod.arn
]
memory_size = 1024
package_type = "Zip"
reserved_concurrent_executions = -1
role = aws_iam_role.lambda_exec.arn
runtime = "nodejs12.x"
s3_bucket = aws_s3_bucket.lambda_bucket_prod.id
s3_key = aws_s3_bucket_object.lambda_node_modules_prod.key
source_code_hash = data.archive_file.lambda_node_modules_prod.output_base64sha256
timeout = 900
tracing_config {
mode = "PassThrough"
}
}
and the role:
resource "aws_iam_role_policy_attachment" "lambda_policy" {
role = aws_iam_role.lambda_exec.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}
resource "aws_iam_role" "lambda_exec" {
name = "api_gateway_role"
assume_role_policy = jsonencode({
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": [
"apigateway.amazonaws.com",
"lambda.amazonaws.com"
]
},
"Action": "sts:AssumeRole"
}
]
})
}
and then the permissions:
resource "aws_lambda_permission" "prod_api_gtw" {
statement_id = "AllowExecutionFromApiGateway"
action = "lambda:InvokeFunction"
function_name = aws_lambda_function.prod_options.function_name
principal = "apigateway.amazonaws.com"
source_arn = "${aws_apigatewayv2_api.gateway_prod.execution_arn}/*/*"
}
After I deploy and try to invoke the URL, I get the following error:
"integrationErrorMessage": "The IAM role configured on the integration or API Gateway doesn't have permissions to call the integration. Check the permissions and try again.",
I've been stuck with this for a while now. How can I solve this error?
You may have to create a Lambda permission to allow execution from an API Gateway resource:
resource "aws_lambda_permission" "apigw_lambda" {
statement_id = "AllowExecutionFromAPIGateway"
action = "lambda:InvokeFunction"
function_name = aws_lambda_function.layout_editor_prod_options.function_name
principal = "apigateway.amazonaws.com"
# The /*/*/* part allows invocation from any stage, method and resource path
# within API Gateway REST API.
source_arn = "${aws_api_gateway_rest_api.rest_api.execution_arn}/*/*/*"
}
Also, for the lambda_exec role, you don't need the apigateway.amazonaws.com principal. The execution role applies to the function and allows it to interact with other AWS services; it won't grant anything to API Gateway. For that, we need the Lambda permission.
resource "aws_iam_role" "lambda_exec" {
name = "lambda_exec_role"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}
On top of that, I would add a policy to the Lambda execution role so it can log to CloudWatch. This might be useful for further debugging:
resource "aws_iam_policy" "lambda_logging" {
name = "lambda_logging"
path = "/"
description = "IAM policy for logging from a lambda"
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "arn:aws:logs:*:*:*",
"Effect": "Allow"
}
]
}
EOF
}
resource "aws_iam_role_policy_attachment" "lambda_logs" {
role = aws_iam_role.lambda_exec.name
policy_arn = aws_iam_policy.lambda_logging.arn
}

Termination Reason: Client.InternalError: Client error on launch

Please help: how do I make sure EC2 is using the custom KMS key? We are using Terraform to deploy the EC2 instances. Every time an EC2 instance is launched in the auto-scaling group, it crashes with the error below. It seems the EC2 instance has no access to the KMS key.
Error: Termination Reason: Client.InternalError: Client error on launch
resource "aws_autoscaling_group" "autoscaling-group" {
name = var.name
availability_zones = var.availability_zones
min_size = var.min_size
desired_capacity = var.desired_capacity
max_size = var.max_size
health_check_type = "EC2"
launch_configuration = aws_launch_configuration.launch_configuration.name
vpc_zone_identifier = local.subnet_id
termination_policies = ["OldestInstance"]
}
resource "aws_launch_configuration" "launch_configuration" {
name = var.name
image_id = var.ami
instance_type = var.instance_type
iam_instance_profile = var.iam_instance_profile_name
security_groups = [aws_security_group.security_group.id]
associate_public_ip_address = true
}
resource "aws_autoscaling_policy" "autoscaling-policy" {
name = var.name
policy_type = "TargetTrackingScaling"
estimated_instance_warmup = "90"
adjustment_type = "ChangeInCapacity"
autoscaling_group_name = aws_autoscaling_group.autoscaling-group.name
}
Thank you.
EDIT: Thank you all for your support, I was able to resolve this. The issue was with the KMS key grant for the EC2 Auto Scaling service. We used the aws_kms_grant resource below and the issue got resolved:
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kms_grant
resource "aws_kms_grant" "a" {
name = "my-grant"
key_id = aws_kms_key.a.key_id
grantee_principal = aws_iam_role.a.arn
operations = ["Encrypt", "Decrypt", "GenerateDataKey"]
}
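Adapted to this case, the grant might look something like the sketch below. The key reference and resource names are assumptions; the grantee principal is the Auto Scaling service-linked role discussed in the answers:

# Sketch (assumed names): grant the Auto Scaling service-linked role use of the CMK.
data "aws_caller_identity" "current" {}

resource "aws_kms_grant" "asg_ebs" {
  name              = "asg-ebs-grant"
  key_id            = aws_kms_key.ebs.key_id # assumed CMK resource
  grantee_principal = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"
  operations        = ["Encrypt", "Decrypt", "GenerateDataKey", "GenerateDataKeyWithoutPlaintext", "DescribeKey", "CreateGrant"]
}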
This is probably happening because the Auto Scaling group can't attach the EBS volume to your EC2 instance. It looks like your EBS volume is encrypted, but the key policy on your customer managed KMS key is missing the statements for the service-linked IAM role Auto Scaling uses, AWSServiceRoleForAutoScaling. You need to add the policy blocks below to the key policy of the KMS key used to encrypt the EBS volume:
{
  "Sid": "Allow use of the key",
  "Effect": "Allow",
  "Principal": {
    "AWS": [
      "arn:aws:iam::<AWS Account Number>:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"
    ]
  },
  "Action": [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey"
  ],
  "Resource": "*"
},
{
  "Sid": "Allow attachment of persistent resources",
  "Effect": "Allow",
  "Principal": {
    "AWS": [
      "arn:aws:iam::<AWS Account Number>:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"
    ]
  },
  "Action": [
    "kms:CreateGrant",
    "kms:ListGrants",
    "kms:RevokeGrant"
  ],
  "Resource": "*",
  "Condition": {
    "Bool": {
      "kms:GrantIsForAWSResource": "true"
    }
  }
}
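Since the thread is about Terraform, these statements would typically end up in the key's policy argument. Below is a minimal sketch of that wiring; the resource names, the root-account admin statement, and the account lookup are assumptions for illustration, not the poster's actual config:

# Sketch (assumed names): CMK whose key policy lets the Auto Scaling
# service-linked role use the key and create grants for AWS resources.
data "aws_caller_identity" "current" {}

locals {
  asg_slr_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"
}

resource "aws_kms_key" "ebs" {
  description = "CMK for encrypted EBS volumes"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # Keep the account root as key administrator so the key stays manageable.
        Sid       = "EnableRootAccount"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
        Action    = "kms:*"
        Resource  = "*"
      },
      {
        Sid       = "AllowUseOfTheKey"
        Effect    = "Allow"
        Principal = { AWS = local.asg_slr_arn }
        Action    = ["kms:Encrypt", "kms:Decrypt", "kms:ReEncrypt*", "kms:GenerateDataKey*", "kms:DescribeKey"]
        Resource  = "*"
      },
      {
        Sid       = "AllowAttachmentOfPersistentResources"
        Effect    = "Allow"
        Principal = { AWS = local.asg_slr_arn }
        Action    = ["kms:CreateGrant", "kms:ListGrants", "kms:RevokeGrant"]
        Resource  = "*"
        Condition = { Bool = { "kms:GrantIsForAWSResource" = "true" } }
      }
    ]
  })
}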
You can execute the plan with TF_LOG=DEBUG to get more details about what is missing. You mostly need a service-linked role in order to get past the permission issue.
I resolved this error by granting the Auto Scaling service-linked role access to the KMS key.
Command to be executed on the target account:

aws kms create-grant --region <Region> --key-id arn:aws:kms:<Region>:111111111111:key/<Key-ID> --grantee-principal arn:aws:iam::999999999999:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling --operations "Encrypt" "Decrypt" "ReEncryptFrom" "ReEncryptTo" "GenerateDataKey" "GenerateDataKeyWithoutPlaintext" "DescribeKey" "CreateGrant"

Region: the same region where the AMI and ASG are to be created
Source account: 111111111111 (owns the AMI whose EBS snapshot is encrypted)
Target account: 999999999999 (where the ASG is created from the source account's AMI)
Snapshot KMS key: go to the AMI in the source account and check the snapshot
IAM role of the target account: arn:aws:iam::999999999999:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling

Attaching AWS security group to multiple EC2 instances

I am spinning up multiple Amazon EC2 instances and need to attach a security group. I am able to achieve it for one EC2 instance but am looking for a solution for multiple EC2s. I am using Terraform 0.12. Please let me know how I can use the data source data "aws_instances" (note the "s").
Here is the code for a single EC2 that I am trying to convert for multiple EC2s:
resource "aws_instance" "ec2_instance" {
count = "${var.ec2_instance_count}"
ami = "${data.aws_ami.app_qrm_ami.id}"
...
}
data "aws_instances" "ec2_instances" {
count = "${var.ec2_instance_count}"
filter {
name = "instance-id"
values = ["${aws_instance.ec2_instance.*.id[count.index]}"]
}
}
resource "aws_network_interface_sg_attachment" "sg_attachment" {
security_group_id = "${data.aws_security_group.security_group.id}"
network_interface_id = "${data.aws_instance.ec2_instance[count.index].network_interface_id}" //facing issues here.
}
I want to achieve this using data "aws_instances" (notice the "s"). Thanks in advance.
To remove the hard-coding of the EC2 AMI, you can use the following data provider:
data "aws_ami" "amazon_linux" {
count = "${var.ec2_instance_count}"
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = [
"amzn-ami-hvm-*-x86_64-gp2",
]
}
filter {
name = "owner-alias"
values = [
"amazon",
]
}
}
For rendering the AMI ID:
resource "aws_instance" "ec2_instance" {
count = "${var.ec2_instance_count}"
ami = "${data.aws_ami.amazon_linux[count.index].id}"
network_interface =
For getting the network_interface_id:
resource "aws_network_interface" "ec2_nic" {
count = "${var.ec2_instance_count}"
subnet_id = "${aws_subnet.public_a.id}"
private_ips = ["10.0.0.50"]
security_groups = ["${aws_security_group.web.id}"]
attachment {
instance = "${aws_instance.ec2_instance[count.index].id}"
}
}
resource "aws_network_interface_sg_attachment" "sg_attachment" {
security_group_id = "${data.aws_security_group.security_group.id}"
network_interface_id = "${aws_network_interface.ec2_ami[count.index].id}"
}
Thanks Karan, your answer solved the issue for me. Later the infra got fairly complex, and I found a different, smarter way to solve it that I would like to share; it might help the TF community in the future.
The setup uses multiple internal SGs (internal 0-7) and one external SG attached to everything, creating different groups of a swarm that are allowed to communicate internally and only selectively externally. It is mostly used in a Microsoft HPC grid.
resource "aws_instance" "ec2_instance" {
count = tonumber(var.mycount)
vpc_security_group_ids = [data.aws_security_group.external_security_group.id, element(data.aws_security_group.internal_security_group.*.id, count.index)]
...
}
resource "aws_security_group" "internal_security_group" {
count = tonumber(var.mycount)
name = "${var.internalSGname}${count.index}"
}
resource "aws_security_group" "external_security_group" {
name = ${var.external_sg_name}"
}
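The data sources referenced by vpc_security_group_ids above aren't shown in the post; a minimal sketch of what they could look like, with the name filters assumed to match the resources above:

# Sketch (assumed lookups): resolve the external SG and the numbered internal SGs
# created above so the instances can reference them by name.
data "aws_security_group" "external_security_group" {
  name = var.external_sg_name
}

data "aws_security_group" "internal_security_group" {
  count = tonumber(var.mycount)
  name  = "${var.internalSGname}${count.index}"
}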

AWS - Configuring Lambda Destinations with SNS

I'm trying to configure an AWS Lambda function to pipe its output into an SNS notification, but it doesn't seem to work. The function executes successfully in the Lambda console and I can see the output is correct, but SNS never seems to get notified or publish anything. I'm working with Terraform to stand up my infra; here is the Terraform code I'm using. Maybe someone can help me out:
resource "aws_lambda_function" "lambda_apigateway_to_sns_function" {
filename = "../node/lambda.zip"
function_name = "LambdaPublishToSns"
handler = "index.snsHandler"
role = aws_iam_role.lambda_apigateway_to_sns_execution_role.arn
runtime = "nodejs12.x"
}
resource "aws_iam_role" "lambda_apigateway_to_sns_execution_role" {
assume_role_policy = <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow"
}
]
}
POLICY
}
resource "aws_iam_role_policy_attachment" "apigateway_to_sns_sns_full_access" {
policy_arn = "arn:aws:iam::aws:policy/AmazonSNSFullAccess"
role = aws_iam_role.lambda_apigateway_to_sns_execution_role.name
}
resource "aws_lambda_function_event_invoke_config" "example" {
function_name = aws_lambda_function.lambda_apigateway_to_sns_function.arn
destination_config {
on_success {
destination = aws_sns_topic.sns_topic.arn
}
on_failure {
destination = aws_sns_topic.sns_topic.arn
}
}
}
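The aws_sns_topic.sns_topic referenced in the destination config isn't shown in the post; it would presumably be a plain topic along these lines (the topic name is an assumption):

# Sketch (assumed name): the SNS topic used as both the on_success and on_failure destination.
resource "aws_sns_topic" "sns_topic" {
  name = "lambda-destination-topic"
}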
And here's my Lambda function code (in NodeJS):
exports.snsHandler = (event, context, callback) => {
  context.callbackWaitsForEmptyEventLoop = false;
  callback(null, {
    statusCode: 200,
    body: event.body + " apigateway"
  });
};
(The function is supposed to take input from API Gateway and, whatever is in the body of the API Gateway request, just append "apigateway" to the end of it and pass the message on. I've tested the integration with API Gateway and that integration works perfectly.)
Thanks!

Terraform forces new ec2 resource creation on plan/apply regarding existing security group

I've got a very simple piece of Terraform code:
provider "aws" {
region = "eu-west-1"
}
module ec2 {
source = "./ec2_instance"
name = "EC2 Instance 1"
}
where the module is:
variable "name" {
default = "Default Name from ec2_instance.tf"
}
resource "aws_instance" "example" {
ami = "ami-e5083683"
instance_type = "t2.nano"
subnet_id = "subnet-3e976259"
associate_public_ip_address = true
security_groups = [ "sg-7310e10b" ]
tags {
Name = "${var.name}"
}
}
When I first run it I get this output:
security_groups.#: "" => "1"
security_groups.1642973399: "" => "sg-7310e10b"
However, the next time I try a plan I get:
security_groups.#: "0" => "1" (forces new resource)
security_groups.1642973399: "" => "sg-7310e10b" (forces new resource)
What gives?!
You are incorrectly assigning a VPC security group ID to security_groups instead of to vpc_security_group_ids. In a VPC, security_groups matches groups by name and any change to it forces replacement, so the provider sees a perpetual diff and recreates the instance on every plan. Change

security_groups = [ "sg-7310e10b" ]

to

vpc_security_group_ids = [ "sg-7310e10b" ]

and everything will be OK.
