I have a terraform recipe which seems to be either hanging, or trying to asynchronously do the same thing A LOT of times and getting tripped up.
Here is the main code :
resource "aws_elasticsearch_domain" "es" {
domain_name = "${var.es_domain}"
elasticsearch_version = "6.3"
cluster_config {
instance_type = "t2.medium.elasticsearch"
}
count = "${var.staff_count}"
vpc_options {
subnet_ids = [
"${aws_subnet.public_subnets.*.id[count.index]}"
]
security_group_ids = [
"${aws_security_group.es_sg.id}"
]
}
ebs_options {
ebs_enabled = true
volume_size = 10
}
access_policies = <<CONFIG
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "es:*",
"Principal": "*",
"Effect": "Allow",
"Resource": "arn:aws:es:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:domain/${var.es_domain}/*"
}
]
}
CONFIG
snapshot_options {
automated_snapshot_start_hour = 23
}
tags {
Domain = "${var.es_domain}"
}
}
Here is the "public subnets" code :
resource "aws_subnet" "public_subnets" {
count = "${var.staff_count}"
cidr_block = "${cidrsubnet(var.vpc_cidr, 8, count.index)}"
vpc_id = "${aws_vpc.main.id}"
availability_zone = "${var.region}${var.az}"
tags = "${merge(map("Name", "${var.company_name}-staff-${count.index}-subnet")
, map("kubernetes.io/cluster/staff-${count.index}", "owned"))}"
}
Here is the variable for my domain :
variable "es_domain" {
default = "my-es-domain"
description = "Domain name for elastic search."
}
And I have a staff_count variable which is "8"
I expected that running this code would give me ONE Elasticsearch domain, with a subnet for each member of staff.
That doesn't seem to be what happens: I seem to get caught in an infinite loop (or some sort of race condition?) that goes on for over an hour until everything times out.
I get a whole bunch of errors that look exactly like the one below, each with a different index:
* aws_elasticsearch_domain.es.3: "arn:aws:es:us-east-1:01043847838460:domain/my-es-domain": Timeout while waiting for the domain to be created
* module.init.aws_elasticsearch_domain.es[0]: 1 error(s) occurred:
Seems like it is trying to do it a whole bunch of times at once, right? If that is the case, I'd really love some guidance on how to fix it, I am new to terraform and am baffled by the syntax.
count = "${var.staff_count}"
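This is the source of your multiple clusters. The count argument tells Terraform how many instances of the resource to create, so with staff_count set to 8 it tries to create eight domains, all with the same domain_name. Remove it from the domain resource to get a single domain.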
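A minimal sketch of a single-domain version, written in the same pre-0.12 syntax as the question. It drops count from the domain and, as an assumption on my part, attaches the domain to the first public subnet only, since as far as I know a VPC domain without zone awareness accepts a single subnet. The arguments not shown are meant to stay as in the question.

```hcl
resource "aws_elasticsearch_domain" "es" {
  # No count here, so exactly one domain is created.
  domain_name           = "${var.es_domain}"
  elasticsearch_version = "6.3"

  cluster_config {
    instance_type = "t2.medium.elasticsearch"
  }

  vpc_options {
    # Assumption: place the domain in the first public subnet only.
    subnet_ids         = ["${aws_subnet.public_subnets.0.id}"]
    security_group_ids = ["${aws_security_group.es_sg.id}"]
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 10
  }

  # access_policies, snapshot_options and tags as in the question.
}
```

The aws_subnet resource can keep its own count, so there is still one subnet per member of staff.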
I'm trying to enable encryption in transit for an environment variable in my Lambda function, but I couldn't find any Terraform documentation that covers this.
I was able to create a customer master key and attach it to the Lambda via kms_key_arn.
I have created this:
data "aws_kms_ciphertext" "secret_encryption" {
key_id = aws_kms_key.kms_key.key_id
plaintext = <<EOF
{
"token": "${var.token}"
}
EOF
}
Now, in my Lambda's environment variables:
environment {
  variables = {
    ENV_TOKEN = data.aws_kms_ciphertext.secret_encryption.ciphertext_blob
  }
}
I also attached kms:Decrypt to the Lambda execution role:
{
"Version": "2012-10-17",
"Statement": {
"Effect": "Allow",
"Action": "kms:Decrypt",
"Resource": "arn:aws:kms:XXXX:XXXX:key/1234-567-...."
}
}
In my lambda:
encrypted_token = os.environ["ENV_TOKEN"]
decrypt_github_token = boto3.client('kms').decrypt(
    CiphertextBlob=base64.b64decode(encrypted_token)
)['Plaintext'].decode('utf-8')
But I'm getting "An error occurred (AccessDeniedException) when calling the Decrypt operation: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access."
Does anyone know what I'm doing wrong?
Should I encrypt only the value rather than the whole key/value JSON?
Maybe the error is happening prior to decryption. I wonder if you can't even read the key itself. You can test this by adding "kms:DescribeKey" to the policy:
{
"Version": "2012-10-17",
"Statement": {
"Effect": "Allow",
"Action": [
"kms:Decrypt",
"kms:DescribeKey"
],
"Resource": "arn:aws:kms:XXXX:XXXX:key/1234-567-...."
}
}
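If the role is managed in Terraform too, the same policy can be sketched there, which also avoids hard-coding the key ARN. This is only a sketch: aws_iam_role.lambda_exec and the kms_access names are hypothetical placeholders, while aws_kms_key.kms_key is the key already referenced in the question.

```hcl
data "aws_iam_policy_document" "kms_access" {
  statement {
    effect  = "Allow"
    actions = ["kms:Decrypt", "kms:DescribeKey"]

    # Reference the key directly so the ARN (including its region) always matches.
    resources = [aws_kms_key.kms_key.arn]
  }
}

resource "aws_iam_role_policy" "kms_access" {
  name   = "kms-access"
  role   = aws_iam_role.lambda_exec.id # hypothetical execution role
  policy = data.aws_iam_policy_document.kms_access.json
}
```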
I haven't been able to find this in the Loki documentation.
Currently, my logs contain a Label resource that has information about the running service. Ideally, I would like to extract some of these values to make the logs very easy to filter by.
This is what the label value looks like:
{"labels":{"configuration_name":"myservicename","location":"region","service_name":"myservicename"},"type":""}
I'd like to be able to remap the service_name. I'm using the Helm chart, and setting the values for scrapeConfigs.relabel_configs - I have this working for some basic remaps, however, I haven't been able to remap the JSON values, or even confirm whether that is possible.
Any help on this would be greatly appreciated!
This is for the Node.js integration of Grafana.
To filter an array, you can use array.filter.
For example, if you have an array like so
var names = [
{ "name":"program" },
{ "name":"application"}
]
You can filter by name like so
var filtered = names.filter(n => n.name == "program")
// filtered = [{"name":"program"}]
Here is a full example:
var logs = [
{
"labels": {
"configuration_name":"serviceA",
"location":"region",
"service_name":"serviceA"
},
"type":""
},
{
"labels": {
"configuration_name":"programB",
"location":"region",
"service_name":"programB"
},
"type":""
},
{
"labels": {
"configuration_name":"systemC",
"location":"region",
"service_name":"systemC"
},
"type":""
}
]
var filteredLogs = logs.filter(l => l.labels.configuration_name == "systemC")
console.log(filteredLogs[0])
I am trying to create a Lambda role and attach it a policy to Allow all ElasticSearch cluster operations.
Below is the code -
resource "aws_iam_role" "lambda_iam" {
name = "lambda_iam"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [{
"Action": [
"es:*"
],
"Effect": "Allow",
"Resource": "*"
}]
}
EOF
}
resource "aws_lambda_function" "developmentlambda" {
filename = "lambda_function.zip"
function_name = "name"
role = "${aws_iam_role.lambda_iam.arn}"
handler = "exports.handler"
source_code_hash = "${filebase64sha256("lambda_function.zip")}"
runtime = "nodejs10.x"
}
I get the following error
Error creating IAM Role lambda_iam: MalformedPolicyDocument: Has prohibited field Resource
The Terraform documentation regarding Resource says you can specify a "*" for ALL users. The Principal field is not mandatory either, so that's not the problem.
I still changed it to be
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "es.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}
But then I got this error:
Error creating Lambda function: InvalidParameterValueException: The role defined for the function cannot be assumed by Lambda.
My lambda function definition is simple
resource "aws_lambda_function" "development_lambda" {
filename = "dev_lambda_function.zip"
function_name = "dev_lambda_function_name"
role = "${aws_iam_role.lambda_iam.arn}"
handler = "exports.test"
source_code_hash = "${filebase64sha256("dev_lambda_function.zip")}"
runtime = "nodejs10.x"
}
The lambda file itself has nothing in it but I do not know if that explains the error.
Is there something I am missing here ?
The assume role policy is the role's trust policy (allowing the role to be assumed), not the role's permissions policy (what permissions the role grants to the assuming entity).
A Lambda execution role needs both types of policies.
The immediate error, that the "role defined for the function cannot be assumed by Lambda", occurs because the trust policy needs "Principal": {"Service": "lambda.amazonaws.com"}, not es.amazonaws.com. Access to Elasticsearch (the es:* actions) goes in the permissions policy instead. I don't use Terraform, but it looks like that might be resource "aws_iam_policy" based on https://www.terraform.io/docs/providers/aws/r/lambda_function.html, which I assume is the reference you are working from.
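As a rough sketch of how the two policy types could look in Terraform, reusing the role name from the question: the trust policy stays on the role, and the es:* statement moves to an inline permissions policy. The lambda_es_access name is a hypothetical placeholder, and aws_iam_role_policy is one possible way to attach permissions, not necessarily the one the linked page uses.

```hcl
# Trust policy: who may assume the role (here, the Lambda service).
resource "aws_iam_role" "lambda_iam" {
  name = "lambda_iam"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Effect": "Allow"
    }
  ]
}
EOF
}

# Permissions policy: what the role may do once it has been assumed.
# "lambda_es_access" is a hypothetical name.
resource "aws_iam_role_policy" "lambda_es_access" {
  name = "lambda_es_access"
  role = "${aws_iam_role.lambda_iam.id}"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["es:*"],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
EOF
}
```

Note that Resource appears only in the permissions policy; the trust policy has a Principal instead, which is why the first attempt was rejected with "Has prohibited field Resource".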
I am trying to create a custom waiter to resume a boto3 script when an rds db cluster is restored to a point in time. (I'm trying to adapt this methodology to my needs: https://medium.com/#Kentzo/customizing-botocore-waiters-83badbfd6399) Aside from the thin documentation on custom waiters this seems like it should be straightforward, but I'm having a permissions issue. The EC2 container where I'm running the script has permissions to run rds:DescribeDBClusters and I can make use of the permission in the script like so:
# Check on the cluster
response = rds.describe_db_clusters(
DBClusterIdentifier=db_cluster_identifier,
)
status = response['DBClusters'][0]['Status']
print(status)
available
But when I set up a custom waiter to monitor this I get the following error:
botocore.exceptions.WaiterError: Waiter DbClusterRestored failed: User: arn:aws:sts::123456789012:assumed-role/OrgIamRole/i-1234567890abcdef is not authorized to perform: rds:DescribeDBClusters
Perhaps I'm missing something obvious, but I don't understand why the waiter is missing permissions to do something that the script that created the waiter is allowed to do.
The container permissions look like this:
"OrgIamPolicy": {
"Type": "AWS::IAM::Policy",
"Properties": {
"PolicyName": "OrgIamPolicy",
"Roles": [
{
"Ref": "OrgIamRole"
}
],
"PolicyDocument": {
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"rds:DescribeDBClusters"
],
"Effect": "Allow",
"Resource": [
"arn:aws:rds:us-east-1:123456789012:*"
]
}
]
}
}
}
And here is my code for restoring the cluster and setting up the waiter:
import boto3
import botocore
import os
import subprocess
rds = boto3.client('rds')
db_cluster_target_instance = 'orgstagingrdsinstance'
db_instance_identifier = 'backupinstance'
db_instance_class = 'db.t2.medium'
target_db_cluster_identifier = "org-backup-cluster"
source_db_cluster_identifier = "org-staging-rds-cluster"
# Create the cluster
response = rds.restore_db_cluster_to_point_in_time(
DBClusterIdentifier=target_db_cluster_identifier,
RestoreType='copy-on-write',
SourceDBClusterIdentifier=source_db_cluster_identifier,
UseLatestRestorableTime=True
)
# Check on the cluster
response = rds.describe_db_clusters(
DBClusterIdentifier=db_cluster_identifier,
)
status = response['DBClusters'][0]['Status']
print(status)
# Create waiter
delay = 10
max_attempts = 30
waiter_name = "DbClusterRestored"
model = botocore.waiter.WaiterModel({
"version": 2,
"waiters": {
"DbClusterRestored": {
"operation": "DescribeDBClusters",
"delay": delay,
"maxAttempts": max_attempts,
"acceptors": [
{
"matcher": "pathAll",
"expected": "available",
"state": "success",
"argument": "DBClusters[].Status"
},
{
"matcher": "pathAll",
"expected": "deleting",
"state": "failure",
"argument": "DBClusters[].Status"
},
{
"matcher": "pathAll",
"expected": "creating",
"state": "failure",
"argument": "DBClusters[].Status"
},
]
}
}
})
waiter = botocore.waiter.create_waiter_with_client(waiter_name, model, rds)
waiter.wait()
Obviously I have this code trimmed and I have obfuscated personal data. Sorry for any errors this might have introduced.
Any help you might give is appreciated.
Okay, the answer to this seems to be pretty simple. The issue is with the scope of the request. The user has permission to run this on the following resource:
"Resource": [
"arn:aws:rds:us-east-1:123456789012:*"
]
When I ran
response = rds.describe_db_clusters(
DBClusterIdentifier=db_cluster_identifier,
)
I was constraining the scope to a cluster that was in arn:aws:rds:us-east-1:123456789012:*. When I ran
waiter = botocore.waiter.create_waiter_with_client(waiter_name, model, rds)
waiter.wait()
I was not passing in that constraint. What I needed to run was
waiter = botocore.waiter.create_waiter_with_client(waiter_name, model, rds)
waiter.wait(DBClusterIdentifier=db_cluster_identifier)
This passed the necessary constraint in and made sure that the permission scope matched the request.
I hope this helps someone in a similar situation.
I'm learning Terraform and I'm trying to get the correct syntax to specify the IAM role permissions for it. I want these capabilities:
Lambda can be invoked from an API Gateway that I also create in Terraform
Lambda can write to Cloudwatch logs
I have the following which allows the API gateway to invoke the Lambda:
resource "aws_iam_role" "my_lambda_execution_role" {
name = "my_lambda_execution_role"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": [
"lambda.amazonaws.com",
"apigateway.amazonaws.com"
]
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}
I have seen the snippet below allows the Lambda to write to CloudWatch. I'm trying to combine these snippets to get all of the permissions but I can't get it right. What is the correct syntax to give all of these permissions to the role?
{
"Statement": [
{
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Effect": "Allow",
"Resource": "arn:aws:logs:*:*:*"
}
]
}
https://www.terraform.io/docs/providers/aws/r/iam_role_policy_attachment.html
You need to create the policy and then attach it to your role. The link above includes a more complete example than on the iam role page.
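A minimal sketch of that approach, assuming the role from the question (my_lambda_execution_role) and the CloudWatch statement the question quotes. The lambda_logging and lambda_logs names are hypothetical placeholders.

```hcl
# Standalone managed policy holding the CloudWatch Logs permissions.
resource "aws_iam_policy" "lambda_logging" {
  name = "lambda_logging"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
EOF
}

# Attach the managed policy to the existing role.
resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = "${aws_iam_role.my_lambda_execution_role.name}"
  policy_arn = "${aws_iam_policy.lambda_logging.arn}"
}
```

The assume_role_policy on the role stays as it is; it only controls who may assume the role, not what the role can do.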
Here is an IAM policy along with the role.
# iam
data "aws_iam_policy_document" "policy" {
statement {
sid = ""
effect = "Allow"
principals {
identifiers = ["lambda.amazonaws.com"]
type = "Service"
}
actions = ["sts:AssumeRole"]
}
}
resource "aws_iam_role" "iam_for_lambda" {
name = "iam_for_lambda"
assume_role_policy = "${data.aws_iam_policy_document.policy.json}"
}
resource "aws_iam_role_policy" "frontend_lambda_role_policy" {
name = "frontend-lambda-role-policy"
role = "${aws_iam_role.iam_for_lambda.id}"
policy = "${data.aws_iam_policy_document.lambda_log_and_invoke_policy.json}"
}
data "aws_iam_policy_document" "lambda_log_and_invoke_policy" {
statement {
effect = "Allow"
actions = [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
]
resources = ["*"]
}
statement {
effect = "Allow"
actions = ["lambda:InvokeFunction"]
resources = ["arn:aws:lambda:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:function:*"]
}
}
Please find the complete terraform code at my github
In a previous answer I wrote up some background information on how IAM roles work and what an "assume role policy" is. I'm going to assume that background information in this answer.
The policy you've given in your assume_role_policy argument in the resource "aws_iam_role" "my_lambda_execution_role" block is the policy governing which users and services are allowed to "assume" this role. In this case, you are allowing AWS Lambda and Amazon API Gateway to make requests using the privileges granted by this role.
However, by default the role doesn't grant any privileges at all. To address that, we need to attach one or more access policies to the role. The other policy JSON you shared here is an access policy, and to associate it with the role we need to use the aws_iam_role_policy resource type:
resource "aws_iam_role_policy" "logs" {
name = "lambda-logs"
role = aws_iam_role.my_lambda_execution_role.name
policy = jsonencode({
"Statement": [
{
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
],
"Effect": "Allow",
"Resource": "arn:aws:logs:*:*:*",
}
]
})
}
Usually Terraform automatically infers dependencies between resource blocks by noticing references like the aws_iam_role.my_lambda_execution_role expression in the above, and indeed in this case Terraform will determine automatically that it needs to complete the creation of the role before attempting to attach the policy to it.
However, Terraform cannot see automatically here that the policy attachment must complete before the role is fully usable, and so when you refer to the role from your API Gateway and Lambda resources you must use depends_on to tell Terraform to wait for the policy attachment:
resource "aws_lambda_function" "example" {
filename = "${path.module}/example.zip"
function_name = "example"
role = aws_iam_role.my_lambda_execution_role.arn
handler = "example"
# (and any other configuration you need)
# Make sure the role policy is attached before trying to use the role
depends_on = [aws_iam_role_policy.logs]
}
If you don't use depends_on like this, there is a risk that the function will be created and executed before the policy attachment is complete, and thus initial executions of your function could fail to write their logs. If your function is not executed immediately after it's created then this probably won't occur in practice, but it's good to include the depends_on to be thorough and to let a future human maintainer know that the role's access policy is also important for the functionality of the Lambda function.