Boto3 Custom waiter rejected for not having a resource permission - amazon-ec2

I am trying to create a custom waiter to resume a boto3 script once an RDS DB cluster has been restored to a point in time. (I'm adapting this methodology to my needs: https://medium.com/@Kentzo/customizing-botocore-waiters-83badbfd6399) Aside from the thin documentation on custom waiters, this seems like it should be straightforward, but I'm running into a permissions issue. The EC2 container where the script runs has permission to call rds:DescribeDBClusters, and I can use that permission in the script like so:
# Check on the cluster
response = rds.describe_db_clusters(
    DBClusterIdentifier=db_cluster_identifier,
)
status = response['DBClusters'][0]['Status']
print(status)

which prints:

available
But when I set up a custom waiter to monitor this I get the following error:
botocore.exceptions.WaiterError: Waiter DbClusterRestored failed: User: arn:aws:sts::123456789012:assumed-role/OrgIamRole/i-1234567890abcdef is not authorized to perform: rds:DescribeDBClusters
Perhaps I'm missing something obvious, but I don't understand why the waiter is missing permissions to do something that the script that created the waiter is allowed to do.
The container permissions look like this:
"OrgIamPolicy": {
  "Type": "AWS::IAM::Policy",
  "Properties": {
    "PolicyName": "OrgIamPolicy",
    "Roles": [
      {
        "Ref": "OrgIamRole"
      }
    ],
    "PolicyDocument": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Action": [
            "rds:DescribeDBClusters"
          ],
          "Effect": "Allow",
          "Resource": [
            "arn:aws:rds:us-east-1:123456789012:*"
          ]
        }
      ]
    }
  }
}
And here is my code for restoring the cluster and setting up the waiter:
import boto3
import botocore
import os
import subprocess
rds = boto3.client('rds')
db_cluster_target_instance = 'orgstagingrdsinstance'
db_instance_identifier = 'backupinstance'
db_instance_class = 'db.t2.medium'
target_db_cluster_identifier = "org-backup-cluster"
source_db_cluster_identifier = "org-staging-rds-cluster"
# Create the cluster
response = rds.restore_db_cluster_to_point_in_time(
    DBClusterIdentifier=target_db_cluster_identifier,
    RestoreType='copy-on-write',
    SourceDBClusterIdentifier=source_db_cluster_identifier,
    UseLatestRestorableTime=True
)

# Check on the cluster
response = rds.describe_db_clusters(
    DBClusterIdentifier=target_db_cluster_identifier,
)
status = response['DBClusters'][0]['Status']
print(status)
# Create waiter
delay = 10
max_attempts = 30
waiter_name = "DbClusterRestored"
model = botocore.waiter.WaiterModel({
    "version": 2,
    "waiters": {
        "DbClusterRestored": {
            "operation": "DescribeDBClusters",
            "delay": delay,
            "maxAttempts": max_attempts,
            "acceptors": [
                {
                    "matcher": "pathAll",
                    "expected": "available",
                    "state": "success",
                    "argument": "DBClusters[].Status"
                },
                {
                    "matcher": "pathAll",
                    "expected": "deleting",
                    "state": "failure",
                    "argument": "DBClusters[].Status"
                },
                {
                    "matcher": "pathAll",
                    "expected": "creating",
                    "state": "failure",
                    "argument": "DBClusters[].Status"
                }
            ]
        }
    }
})
waiter = botocore.waiter.create_waiter_with_client(waiter_name, model, rds)
waiter.wait()
Obviously I have trimmed this code and obfuscated personal data. Sorry for any errors this might have introduced.
Any help you might give is appreciated.

Okay, the answer to this turns out to be pretty simple. The issue is the scope of the request. The role has permission to run this against the following resource:
"Resource": [
  "arn:aws:rds:us-east-1:123456789012:*"
]
When I ran
response = rds.describe_db_clusters(
    DBClusterIdentifier=target_db_cluster_identifier,
)
I was constraining the scope to a cluster that was in arn:aws:rds:us-east-1:123456789012:*. When I ran
waiter = botocore.waiter.create_waiter_with_client(waiter_name, model, rds)
waiter.wait()
I was not passing in that constraint. What I needed to run was
waiter = botocore.waiter.create_waiter_with_client(waiter_name, model, rds)
waiter.wait(DBClusterIdentifier=target_db_cluster_identifier)
This passed the necessary constraint in and made sure that the permission scope matched the request.
I hope this helps someone in a similar situation.
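To make the mechanics concrete, here is a rough pure-Python sketch of what the waiter's loop is doing under the hood. The function and the describe_fn stub are mine, not botocore internals; the point is that whatever keyword arguments you give wait() are forwarded to the underlying DescribeDBClusters call, which is how the DBClusterIdentifier constraint reaches IAM on every poll:

```python
import time

def wait_for_cluster(describe_fn, cluster_id, delay=10, max_attempts=30):
    """Sketch of the custom waiter's polling loop.

    `describe_fn` stands in for rds.describe_db_clusters; forwarding
    DBClusterIdentifier on every poll is what keeps each request inside
    the IAM policy's resource scope.
    """
    for _ in range(max_attempts):
        response = describe_fn(DBClusterIdentifier=cluster_id)
        statuses = [c['Status'] for c in response['DBClusters']]
        # "pathAll" success acceptor: every cluster status is "available"
        if all(s == 'available' for s in statuses):
            return 'available'
        # "pathAll" failure acceptors: every status is "deleting" or "creating"
        if all(s == 'deleting' for s in statuses) or all(s == 'creating' for s in statuses):
            raise RuntimeError(f"Waiter failed: cluster status is {statuses}")
        time.sleep(delay)
    raise TimeoutError("Cluster never became available")
```

Calling wait_for_cluster without cluster_id is the moral equivalent of waiter.wait() with no arguments: the DescribeDBClusters request goes out unscoped, and IAM rejects it.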

Related

is there any possibility in terraform to enable Encryption in transit

I'm trying to enable encryption in transit for an environment variable in my Lambda, but I couldn't find any documentation in Terraform on how to do this.
I was able to create a customer master key and attach it to the Lambda via kms_key_arn.
I have created this :
data "aws_kms_ciphertext" "secret_encryption" {
  key_id    = aws_kms_key.kms_key.key_id
  plaintext = <<EOF
{
  "token": "${var.token}"
}
EOF
}
Now in my Lambda's environment variables:
environment {
  variables = {
    ENV_TOKEN = data.aws_kms_ciphertext.secret_encryption.ciphertext_blob
  }
}
I also attached kms:Decrypt to the Lambda execution role:
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": "kms:Decrypt",
    "Resource": "arn:aws:kms:XXXX:XXXX:key/1234-567-...."
  }
}
In my lambda:
encrypted_token = os.environ["ENV_TOKEN"]
decrypt_github_token = boto3.client('kms').decrypt(
    CiphertextBlob=base64.b64decode(encrypted_token)
)['Plaintext'].decode('utf-8')
But I'm getting: "An error occurred (AccessDeniedException) when calling the Decrypt operation: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access."
Does anyone know where I'm going wrong?
Should only the value be encrypted, rather than the whole key/value pair?
Maybe the error is happening prior to decryption; I wonder whether you can even read the key itself. You can test this by adding "kms:DescribeKey":
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": [
      "kms:Decrypt",
      "kms:DescribeKey"
    ],
    "Resource": "arn:aws:kms:XXXX:XXXX:key/1234-567-...."
  }
}
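Once the key permissions are sorted, the decrypt step itself is easy to isolate for testing. Below is a small sketch of the Lambda-side logic; the helper name is mine, and it simply wraps the base64 decode plus kms:Decrypt call from the question, which makes it possible to rule out an encoding problem separately from a permissions one:

```python
import base64

def decrypt_env_token(kms_client, ciphertext_b64):
    """Base64-decode the ciphertext stored in the environment variable
    and ask KMS to decrypt it.

    The ciphertext blob embeds the ARN of the CMK that produced it, so
    the execution role must be allowed to use that exact key, in that
    same region and account.
    """
    plaintext = kms_client.decrypt(
        CiphertextBlob=base64.b64decode(ciphertext_b64)
    )['Plaintext']
    return plaintext.decode('utf-8')
```

In the Lambda you would call it as decrypt_env_token(boto3.client('kms'), os.environ['ENV_TOKEN']); if that still raises AccessDeniedException with the policy above in place, double-check that the Lambda runs in the same region as the key.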

(InvalidRequestException) when calling the GetQueryResults..... Querying Athena From Lambda Python.... Cannot Read Results

I have been trying to query Athena from my Lambda function (Python 3.8), but I keep getting the same error. I tried adding an if/else statement to check the status of the execution, and I still get the same error in the AWS console and locally via the CLI.
Here is the lambda function:
import json
import boto3
import time

def function(event, context):
    client = boto3.client('athena')

    # Set up and perform the query
    queryStart = client.start_query_execution(
        QueryString='SELECT * FROM my_s3_bucket_developer limit 8;',
        QueryExecutionContext={
            'Database': 'mydb'
        },
        ResultConfiguration={
            'OutputLocation': 's3://athena-results-queries-developer/'
        }
    )

    # Get the query ID
    queryId = queryStart['QueryExecutionId']

    # Sleep because we don't know how long the query will take to execute
    time.sleep(25)

    results = client.get_query_results(QueryExecutionId=queryId)
    for row in results['ResultSet']['Rows']:
        print(row)
and this is the IAM Role I have attached to my lambda function:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::athena-results-queries-developer/*"
      ]
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "athena:StartQueryExecution",
        "athena:StopQueryExecution",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "glue:GetTable"
      ],
      "Resource": "*"
    }
  ]
}
This is the error I keep getting in the logs:
An error occurred (InvalidRequestException) when calling the GetQueryResults operation: Query did not finish successfully. Final query state: FAILED
"errorType": "InvalidRequestException",
"stackTrace": [
  [
    "/var/task/lambda_function.py",
    26,
    "function",
    "results=client.get_query_results(QueryExecutionId = queryId)"
  ],
  [
    "/var/runtime/botocore/client.py",
    316,
    "_api_call",
    "return self._make_api_call(operation_name, kwargs)"
  ],
  [
    "/var/runtime/botocore/client.py",
    626,
    "_make_api_call",
    "raise error_class(parsed_response, operation_name)"
  ]
]
If anyone can help me I would really appreciate it; I have been trying to solve this for days.
The problem is that you don't wait for the query to complete properly. You need to call get_query_execution and check that the query has succeeded before you call get_query_results.
There's a full example here that you could take inspiration from: https://www.ilkkapeltola.fi/2018/04/simple-way-to-query-amazon-athena-in.html
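To make the waiting concrete, here is a hedged sketch of such a polling loop (the function name is mine, not from the linked article). It replaces the fixed time.sleep(25) and surfaces Athena's actual failure reason instead of the generic InvalidRequestException:

```python
import time

def wait_for_athena_query(client, query_id, delay=2, max_attempts=30):
    """Poll GetQueryExecution until the query leaves a running state.

    Returns the final state on success; raises with Athena's
    StateChangeReason on failure so the real error (often an S3
    output-location or Glue permission problem) is visible.
    """
    for _ in range(max_attempts):
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution['QueryExecution']['Status']['State']
        if state == 'SUCCEEDED':
            return state
        if state in ('FAILED', 'CANCELLED'):
            reason = execution['QueryExecution']['Status'].get(
                'StateChangeReason', 'no reason reported')
            raise RuntimeError(f"Query {state}: {reason}")
        time.sleep(delay)
    raise TimeoutError(f"Query still running after {max_attempts} checks")
```

With this in place, get_query_results is only called after the loop returns SUCCEEDED, and a FAILED query raises immediately with the reason Athena reported rather than after a blind 25-second sleep.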

Correct terraform syntax for adding permissions to AWS Lambda

I'm learning Terraform and I'm trying to get the correct syntax to specify the IAM role permissions for it. I want these capabilities:
Lambda can be invoked from an API Gateway that I also create in Terraform
Lambda can write to Cloudwatch logs
I have the following which allows the API gateway to invoke the Lambda:
resource "aws_iam_role" "my_lambda_execution_role" {
  name               = "my_lambda_execution_role"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": [
          "lambda.amazonaws.com",
          "apigateway.amazonaws.com"
        ]
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}
I have seen that the snippet below allows the Lambda to write to CloudWatch. I'm trying to combine these snippets to get all of the permissions, but I can't get it right. What is the correct syntax to give all of these permissions to the role?
{
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
https://www.terraform.io/docs/providers/aws/r/iam_role_policy_attachment.html
You need to create the policy and then attach it to your role. The link above includes a more complete example than the one on the iam_role page.
Here is an IAM policy along with the role:
# iam
data "aws_iam_policy_document" "policy" {
  statement {
    sid    = ""
    effect = "Allow"
    principals {
      identifiers = ["lambda.amazonaws.com"]
      type        = "Service"
    }
    actions = ["sts:AssumeRole"]
  }
}

resource "aws_iam_role" "iam_for_lambda" {
  name               = "iam_for_lambda"
  assume_role_policy = "${data.aws_iam_policy_document.policy.json}"
}

resource "aws_iam_role_policy" "frontend_lambda_role_policy" {
  name   = "frontend-lambda-role-policy"
  role   = "${aws_iam_role.iam_for_lambda.id}"
  policy = "${data.aws_iam_policy_document.lambda_log_and_invoke_policy.json}"
}

data "aws_iam_policy_document" "lambda_log_and_invoke_policy" {
  statement {
    effect = "Allow"
    actions = [
      "logs:CreateLogGroup",
      "logs:CreateLogStream",
      "logs:PutLogEvents",
    ]
    resources = ["*"]
  }

  statement {
    effect    = "Allow"
    actions   = ["lambda:InvokeFunction"]
    resources = ["arn:aws:lambda:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:function:*"]
  }
}
Please find the complete terraform code at my github
In a previous answer I wrote up some background information on how IAM roles work and what an "assume role policy" is. I'm going to assume that background information in this answer.
The policy you've given in your assume_role_policy argument in the resource "aws_iam_role" "my_lambda_execution_role" block is the policy governing which users and services are allowed to "assume" this role. In this case, you are allowing AWS Lambda and Amazon API Gateway to make requests using the privileges granted by this role.
However, by default the role doesn't grant any privileges at all. To address that, we need to attach one or more access policies to the role. The other policy JSON you shared here is an access policy, and to associate it with the role we need to use the aws_iam_role_policy resource type:
resource "aws_iam_role_policy" "logs" {
  name   = "lambda-logs"
  role   = aws_iam_role.my_lambda_execution_role.name
  policy = jsonencode({
    "Statement": [
      {
        "Action": [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents",
        ],
        "Effect": "Allow",
        "Resource": "arn:aws:logs:*:*:*",
      }
    ]
  })
}
Usually Terraform automatically infers dependencies between resource blocks by noticing references like the aws_iam_role.my_lambda_execution_role expression in the above, and indeed in this case Terraform will determine automatically that it needs to complete the creation of the role before attempting to attach the policy to it.
However, Terraform cannot automatically see that the policy attachment must complete before the role itself is usable, so when you refer to the role from your API Gateway and Lambda resources you must use depends_on to tell Terraform that the policy attachment must complete before the role will work as expected:
resource "aws_lambda_function" "example" {
  filename      = "${path.module}/example.zip"
  function_name = "example"
  role          = aws_iam_role.my_lambda_execution_role.arn
  handler       = "example"
  # (and any other configuration you need)

  # Make sure the role policy is attached before trying to use the role
  depends_on = [aws_iam_role_policy.logs]
}
If you don't use depends_on like this, there is a risk that the function will be created and executed before the role attachment is complete, and thus initial executions of your function could fail to write their logs. If your function is not executed immediately after it's created then this probably won't occur in practice, but it's good to include the depends_on to be thorough and to let a future human maintainer know that the role's access policy is also important for the functionality of the Lambda function.

Assigned function policy to lambda which allows all CloudWatch Events rule to invoke lambda?

I used the above CLI command but got an error in the console; please find the attached screenshot of the error.
Please find below the function policy of the Lambda:
{
  "Version": "2012-10-17",
  "Id": "default",
  "Statement": [
    {
      "Sid": "events-access",
      "Effect": "Allow",
      "Principal": {
        "Service": "events.amazonaws.com"
      },
      "Action": "lambda:InvokeFunction",
      "Resource": "arn:aws:lambda:us-east-1:096280016729:function:leto_debug_log",
      "Condition": {
        "ArnLike": {
          "AWS:SourceArn": "arn:aws:events:us-east-1:096280016729:rule/*"
        }
      }
    }
  ]
}
I followed the answer from the below link but still got an error:
Allow all cloudwatch event rules to have access to lambda function
Perhaps a clue to this is that a CloudWatch Events rule name of * does not appear to be valid. For example, if you try to delete this rule in the AWS Lambda console, you will get an error in the trigger UI area.
It would be nice if this approach were formally supported in some way, but I don't think it is.

Terraform recipe to create elasticsearch domain seems to be hanging

I have a Terraform recipe which seems to be either hanging, or trying to do the same thing asynchronously a lot of times and tripping over itself.
Here is the main code:
resource "aws_elasticsearch_domain" "es" {
  domain_name           = "${var.es_domain}"
  elasticsearch_version = "6.3"
  count                 = "${var.staff_count}"

  cluster_config {
    instance_type = "t2.medium.elasticsearch"
  }

  vpc_options {
    subnet_ids = [
      "${aws_subnet.public_subnets.*.id[count.index]}"
    ]
    security_group_ids = [
      "${aws_security_group.es_sg.id}"
    ]
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 10
  }

  access_policies = <<CONFIG
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "es:*",
      "Principal": "*",
      "Effect": "Allow",
      "Resource": "arn:aws:es:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:domain/${var.es_domain}/*"
    }
  ]
}
CONFIG

  snapshot_options {
    automated_snapshot_start_hour = 23
  }

  tags {
    Domain = "${var.es_domain}"
  }
}
Here is the "public subnets" code:
resource "aws_subnet" "public_subnets" {
  count             = "${var.staff_count}"
  cidr_block        = "${cidrsubnet(var.vpc_cidr, 8, count.index)}"
  vpc_id            = "${aws_vpc.main.id}"
  availability_zone = "${var.region}${var.az}"
  tags = "${merge(map("Name", "${var.company_name}-staff-${count.index}-subnet"),
                  map("kubernetes.io/cluster/staff-${count.index}", "owned"))}"
}

Here is the variable for my domain:
variable "es_domain" {
  default     = "my-es-domain"
  description = "Domain name for Elasticsearch."
}
And I have a staff_count variable set to "8".
Now, I would have expected the result to be ONE Elasticsearch domain, with a subnet for each member of staff.
That doesn't seem to be what is happening; instead everything gets caught up in what looks like an infinite loop (or some sort of race condition?) that goes on for over an hour until it all times out.
I get a whole bunch of errors that look exactly like the one below, but with different numbers:
* aws_elasticsearch_domain.es.3: "arn:aws:es:us-east-1:01043847838460:domain/my-es-domain": Timeout while waiting for the domain to be created
* module.init.aws_elasticsearch_domain.es[0]: 1 error(s) occurred:
It seems like it is trying to do this a whole bunch of times at once, right? If that is the case, I'd really love some guidance on how to fix it; I am new to Terraform and am baffled by the syntax.
count = "${var.staff_count}"
This is the source of your multiple clusters. The count property tells Terraform how many copies of a resource to make, so with staff_count set to 8 you are asking for eight Elasticsearch domains, all competing for the same domain name. Drop count from the aws_elasticsearch_domain resource (keeping it on the subnets) and reference the subnet list directly in vpc_options if you want a single domain.
