AWS-CDK: Cross account Resource Access and Resource reference - aws-lambda

I have a secret key-value pair in Secrets Manager in Account-1 in us-east-1. This secret is encrypted using a Customer managed KMS key - let's call it KMS-Account-1. All this has been created via console.
Now we turn to CDK. We have cdk.pipelines.CodePipeline which deploys Lambda to multiple stages/environments - so 1st to { Account-2, us-east-1 } then to { Account-3, eu-west-1 } and so on. This has been done.
The lambda code in all stages/environments above, now needs to be changed to use the secret key-value pair present with Account-1's us-east-1 SecretsManager by getting it via secretsmanager client. That code should probably look like this (python):
client = boto3.session.Session().client(
service_name = 'secretsmanager',
region_name = 'us-east-1'
)
resp = client.get_secret_value(
SecretId='arn:aws:secretsmanager:us-east-1:<ACCOUNT-1>:secret:name/of/the/secret'
)
secret = json.loads(resp['SecretString'])
All lambdas in various accounts and regions (ie. environments) will have the exact same code as above since the secret needs to be fetched from Account-1 in us-east-1.
Firstly I hope this is conceptually possible. Is that right?
Next how do I change the cdk code to facilitate this? How will the code-deploy in code-pipeline get permission to import this custom kms key and SecretManager' secretand apply correct permissions for cross account access by the lambdas that the cdk pipeline creates ?
Can someone please give some pointers?

This is bit tricky as CloudFormation, and hence CDK, doesn't allow cross account/cross stage references because CloudFormation export doesn't work cross account as far as my understanding goes. All these patterns of "centralised" resources fall into that category - ie. resource in one account (or a stage in CDK) referenced by other stages.
If the resource is created outside the context of CDK (like via console), then you might as well hardcode the names/arns/etc. throughout the CDK code where its used and that should be sufficient.
For resources that have the ability to hold resource based policies, it's simpler as you can just attach the cross-account access permissions to them directly - again, offline via console since you are maintaining it manually anyway. Each time you add a stage (account) to your pipeline, you will need to go to the resource and add cross-account permissions manually.
For resources that don't have resource based policies, like SSM for eg., things are a bit roundabout as you will need to create a Role that can be assumed cross-account and then access the resource. In that case you will have to separately maintain the IAM Role too and manually update the trust policy to other accounts as you add stages to your CDK pipeline. Then, as usual hardcode the role arn in your CDK code, assume it in some CustomResource lambda and use it.
It gets more interesting if the creation is also done in the CDK code itself (ie. managed by CloudFormation - not done separately via console/aws-cli etc.). In this case, many times you wouldn't "know" the exact ARNs as the physical-id would be generated by CloudFormation and likely be a part of the ARN. Even influencing the physical-id yourself (like by hardcoding the bucket name) might not solve it in all cases. Eg. KMS ARNs and SecretManager ARNs append unique-ids or some sort of hashes to the end of the ARN.
Instead of trying to work all that out, it would be best left untouched and let CFn generate whatever random name/arn it chooses. To then reference these constructs/ARNs, just put them into SSM Parameters in the source/central account. SSM doesn't have resource based policy that I know of. So additionally create a role in cdk that trusts the accounts in your cdk code. Once done, there is no more maintenance - each time you add new environments/accounts to CDK (assuming its a cdk pipeline here), the "loop" construct that you will create will automatically add the new account into the trust relationship.
Now all you need to do is to distribute this role-arn and the SSM Parameternames to other stages. Choose an explicit role-name and SSM Parameters. The manual ARN construction given a rolename is pretty straightforward. So distribute that and SSM Parameters around your CDK code to other stages (compile time strings instead of references). In target stages, create custom-resource(s) (AWSCustomResource) backed by AwsSdkCall lambda to simply assume this role-arn and make the SDK call to retrieve the SSM Parameter values. These values can be anything, like your KMS ARNs, SecretManager's full ARNs etc. which you couldn't easily guess. Now simply use these.
Roundabout way to do a simple thing, but so far that is all I could do to get this to work.
#You need to maintain this list no matter what you do - so it's nothing extra
all_other_accounts = [ <list of accounts that this cdk deploys to> ]
account_principals = [iam.AccountPrincipal(a) for a in all_other_account]
role = iam.Role(
assumed_by = iam.CompositePrincipal(*account_principals), #auto-updated as you change the list above
role_name = some_explicit_name,
...
)
role_arn = f'arn:aws:iam::<account-of-this-stack>:role/{some_explicit_name}'
kms0 = kms.Key(...)
kms0.grant_decrypt(role)
# Because KMS also needs explicit resource policy even if role policy allows access to it
kms0.add_to_role_policy(iam.PolicyStatement(principals = [iam.ArnPrincipal(role_arn)], actions = ...))
kms1 = kms.Key(...)
kms1.grant_decrypt(role)
kms0.add_to_role_policy(... same as above ...)
secrets0 = secretsmanager.Secret(...) #maybe this is based off kms0
secrets0.grant_read(role)
secrets1 = secretsmanager.Secret(...) #maybe this is based off kms1
secrets1.grant_read(role)
# You can turn all this into a loop ofc.
ssm0 = ssm.StingParameter(self, '...', parameter_name = 'kms0_arn', string_value = kms0.key_arn, ...)
ssm0.grant_read(role)
ssm1 = ssm.StingParameter(self, '...', parameter_name = 'kms1_arn', string_value = kms1.key_arn, ...)
ssm1.grant_read(role)
ssm2 = ssm.StingParameter(self, '...', parameter_name = 'secrets0_arn', string_value = secrets0.secret_full_arn, ...)
ssm2.grant_read(role)
...
#Now simply pass around the role and ssm parameter names
for env in environments:
MyApplicationStage(self, <...>, ..., role_arn = role_arn, params = [ 'kms0_arn', 'kms1_arn', ... ], ...)
And then in the target stage(s):
for parm in params:
fn = AwsSdkCall('ssm', 'get_parameter', { "Name": param }, ...)
acr = AwsCustomResource(..., on_create = fn, on_update = fn, ...)
collect['param'] = acr.get_response_field('Parameter.Value')
Now do whatever you want with the collected artifacts, including supplying them as environment variables to your main service lambda (which will be resolved at deploy time).
Remember they will all be Tokens and resolved only at deploy time, but that's true of any resource, whether or not via custom-resource and it shouldn't matter.
That's a generic pattern which should work for any case.
(GitHub link where this question was asked and I had answered it there too)

Related

How to set OpenSearch/Elasticsearch as the destination of a Kinesis Firehose?

I am trying to create Data Stream -> Firehose -> OpenSearch infrastructure using the AWS CDK v2. I was surprised to find that, although OpenSearch is a supported Firehose destination, there is nothing in the CDK to support this use case.
In my CDK Stack I have created an OpenSearch Domain, and am trying to create a Kinesis Firehose DeliveryStream with that domain as the destination. However, kinesisfirehose-destinations package seems to only have a ready-to-use destination for S3 buckets, so there is no obvious way to do this easily using only the constructs supplied by the aws-cdk, not even using the alpha packages.
I think I should be able to write an OpenSearch destination construct by implementing IDestination. I have tried the following simplistic implementation:
import {Construct} from "constructs"
import * as firehose from "#aws-cdk/aws-kinesisfirehose-alpha"
import {aws_opensearchservice as opensearch} from "aws-cdk-lib"
export class OpenSearchDomainDestination implements firehose.IDestination {
private readonly dest: opensearch.Domain
constructor(dest: opensearch.Domain) {
this.dest = dest
}
bind(scope: Construct, options: firehose.DestinationBindOptions): firehose.DestinationConfig {
return {dependables: [this.dest]}
}
}
then I can use it like so,
export class MyStack extends Stack {
...
private createFirehose(input: kinesis.Stream, output: opensearch.Domain) {
const destination = new OpenSearchDomainDestination(output)
const deliveryStream = new firehose.DeliveryStream(this, "FirehoseDeliveryStream", {
destinations: [destination],
sourceStream: input,
})
input.grantRead(deliveryStream)
output.grantWrite(deliveryStream)
}
}
This will compile and cdk synth runs just fine. However, I get the following error when running cdk deploy:
CREATE_FAILED | AWS::KinesisFirehose::DeliveryStream | ... Resource handler returned message: "Exactly one destination configuration is supported for a Firehose
I'm not sure I understand this message but it seems to imply that it will reject outright everything except the one provided S3 bucket destination.
So, my titular question could be answered by the answer to either of these two questions:
How are you supposed to implement bind in IDestination?
Are there any complete working examples of creating a Firehose to OpenSearch using the non-alpha L1 constructs?
(FYI I have also asked this question on the AWS forum but have not yet received an answer.)
Other destinations (at the moment) than S3 are not supported by the L2 constructs. This is described at https://docs.aws.amazon.com/cdk/api/v1/docs/aws-kinesisfirehose-readme.html
In such cases, I go to the source code to see what can be done. See https://github.com/aws/aws-cdk/blob/master/packages/%40aws-cdk/aws-kinesisfirehose/lib/destination.ts . There is no easy way how to inject other destination than S3 since the DestinationConfig does not support it. You can see at https://github.com/aws/aws-cdk/blob/master/packages/%40aws-cdk/aws-kinesisfirehose-destinations/lib/s3-bucket.ts how the config for S3 is crafted. And you can see how that config is used to translate to L1 construct CfnDeliveryStream at https://github.com/aws/aws-cdk/blob/f82d96bfed427f8e49910ac7c77004765b2f5f6c/packages/%40aws-cdk/aws-kinesisfirehose/lib/delivery-stream.ts#L364
Probably easiest way at the moment is to write down your L1 constructs to define destination as OpenSearch.

How to handle weird API flow with implicit create step in custom terraform provider

Most terraform providers demand a predefined flow, Create/Read/Update/Delete/Exists
I am in a weird situation developing a provider against an API where this behavior diverges a bit.
There are two kinds of resources, Host and Scope. A host can have many scopes. Scopes are updated with configurations.
This generally fits well into the terraform flow, it has a full CRUDE flow possible - except for one instance.
When a new Host is made, it automatically has a default scope attached to it. It is always there, cannot be deleted etc.
I can't figure out how to have my provider gracefully handle this, as I would want the tf to treat it like any other resource, but it doesn't have an explicit CREATE/DELETE, only READ/UPDATE/EXISTS - but every other scope attached to the host would have CREATE/DELETE.
Importing is not an option due to density, requiring an import for every host would render the entire thing pointless.
I originally was going to attempt to split Scopes and Configurations into separate resources so one could be full-filled by the Host (the host providing the Scope ID for a configuration, and then other configurations can get their scope IDs from a scope resource)
However this approach falls apart because the API for both are the same, unless I wanted to add the abstraction of creating an empty scope then applying a configuration against it, which may not be fully supported. It would essentially be two resources controlling one resource which could lead to dramatic conflicts.
A paraphrased example of an execution I thought about implementing
resource "host" "test_integrations" {
name = "test.integrations.domain.com"
account_hash = "${local.integrationAccountHash}"
services = [40]
}
resource "configuration" "test_integrations_root_configuration" {
name = "root"
parent_host = "${host.test_integrations.id}"
account_hash = "${local.integrationAccountHash}"
scope_id = "${host.test_integrations.root_scope_id}"
hostnames = ["test.integrations.domain.com"]
}
resource "scope" "test_integrations_other" {
account_hash = "${local.integrationAccountHash}"
host_hash = "${host.test_integrations.id}"
path = "/non/root/path"
name = "Some Other URI Path"
}
resource "configuration" "test_integrations_other_configuration" {
name = "other"
parent_host = "${host.test_integrations.id}"
account_hash = "${local.integrationAccountHash}"
scope_id = "${host.test_integrations_other.id}"
}
In this example flow, a configuration and scope resource unfortunately are pointing to the same resource which I am worried would cause conflicts or confusion on who is responsible for what and dramatically confuses the create/delete lifecycle
But I can't figure out how the TF lifecycle would allow for a resource that would only UPDATE/READ/EXISTS if say a flag was given (and how state would handle that)
An alternative would be to just have a Configuration resource, but then if it was the root configuration it would need to skip create/delete as it is inherently tied to the host
Ideally I'd be able to handle this situation gracefully. I am trying to avoid including the root scope/configuration in the host definition as it would create a split in how they are written and handled.
The documentation for providers implies you can use a resource AS a schema object in a resource, but does not explain how or why. If it works the way I imagine it, it may work to create a resource that is only used to inject into the host perhaps - but I don't know if that is how it works and if it is how to accomplish it.
I believe I tentatively have found a solution after asking some folks on the gopher slack.
Using AWS Provider Default VPC as a reference, I can "clone" the resource into one with a custom Create/Delete lifecycle
Loose Example:
func defaultResourceConfiguration() *schema.Resource {
drc := resourceConfiguration()
drc.Create = resourceDefaultConfigurationCreate
drc.Delete = resourceDefaultConfigurationDelete
return drc
}
func resourceDefaultConfigurationCreate(d *schema.ResourceData, m interface{}) error {
// double check it exists and update the resource instead
return resourceConfigurationUpdate(d, m)
}
func resourceDefaultConfigurationDelete(d *schema.ResourceData, m interface{}) error {
log.Printf("[WARN] Cannot destroy Default Scope Configuration. Terraform will remove this resource from the state file, however resources may remain.")
return nil
}
This should allow me to provide an identical resource that is designed to interact with the already existing one created by its parent host.

terraform destroy doesn't delete the ec2 instance created using input parameters for variables

I tried launching an ec2 instance using input parameters for the variables in terraform apply command. This creates the instance successfully. However, when I try to delete the instance using terraform destory, it executes but nothing gets deleted.
So I have a region variable with a default value. When I pass a different region in this variable using input parameters,instance launchesjust fine in the provided region but I am not able to terminate it using terraform destroy.
main.tf
variable "region" {
default = "us-west-1"
}
variable "ami" {
type = "map"
default = {
us-east-2 = "ami-02e680c4540db351e"
us-west-1 = "ami-011b6930a81cd6aaf"
}
}
provider "aws" {
region = "${var.region}"
}
resource "aws_instance" "web" {
ami = "${lookup(var.ami,var.region)}"
instance_type = "t2.micro"
tags {
Name = "naxi"
}
}
Terraform apply:
terraform apply -var region=us-east-2
Output of terraform destroy :
aws_instance.web: Refreshing state... (ID: i-05ca0514f61dcaf16)
Do you really want to destroy all resources?
Terraform will destroy all your managed infrastructure, as shown above.
There is no undo. Only 'yes' will be accepted to confirm.
Enter a value: yes
Destroy complete! Resources: 0 destroyed.
Though it's able to lookup the instance id in the correct region, my guess is that it is trying to terminate the instance from the default region and not from the one I supplied as parameter.
Is there a way I can supply a parameter -var region=something with terraform destroy?
Destroy works as expected if I use the default values and no input parameters.
EDIT---
As soon as I give the this command: terraform destroy -varfile=variables.tfvars, all the instance related information from terraform.tfstate file gets removed and all the previous content of this file gets saved as backup to terraform.tfstate.backup. But still the instance is not deleted.
I think this is your main problem:
You ran apply with your "aws" provider defined one way (via a variable), but then you ran destroy with the same "aws" provider defined differently (you let the "region" variable default instead of specifying it).
As a result, terraform destroy looked in the wrong place (wrong AWS region) for your created resources.
Since terraform destroy was looking in the wrong place, it found nothing there.
Therefore terraform destroy saw that it did not need to destroy anything, just update its locally stored state information to reflect the absence of the resources.
Try these steps instead:
terraform apply -var 'region=us-east-2'
terraform destroy -var 'region=us-east-2'
This works for me, Terraform v0.12.2 + provider.aws v2.16.0.
I am guessing slightly here, but it seems like the point is probably that you, the Terraform user, are responsible for making sure you destroy with the exact same provider definitions you apply'd with.
And if you're using any variables to help define your providers, then this is something you will need to be especially mindful of, since you are making it easy to accidentally change provider definitions.
As a side note, I ran into a similar confusion myself. It seems to me that HashiCorp's Getting Started guide, in its current state, could do a better job of warning about this. It walks newbies through a very similar setup to yours, and currently appears to say nothing about how to destroy properly, or any potential pitfalls.
Perhaps you have multiple providers set. Try aliasing your provider and passing that into the resource.
provider "aws" {
region = var.region
alias = "mine"
}
resource "aws_instance" "web" {
provider = aws.mine
}

Puppet: Making a custom function depend on a resource

I have a Puppet custom function that returns information about a user defined in OpenStack's Keystone identity service. Usage is something along the lines of:
$tenant_id = lookup_tenant_by_name($username, $password, "mytenant")
The problem is that the credentials used in this query ($username) are supposed to be created by another resource during the Puppet run (a Keystone_user resource from puppet-keystone). As far as I can tell, the call to the lookup_tenant_by_name function is being evaluated before any resource ordering happens, because no amount of dependencies in the calling code is able to force the credentials to be created prior to this function being executed.
In general, it is possible to write custom functions -- or place them appropriately in a manifest -- such that they will not be executed by Puppet until after some specified resource has been instantiated?
Short answer: You cannot make your manifest's behavior depend on resources declared inside of it.
Long answer: Parser functions are called during the compilation phase (on the master if you use one, or the agent if you use puppet apply). In neither case can it ever run before any resource is synced, because that will happen after the compiler has done all its work (including invocation of your functions).
To query information from the agent machine, you generally want to use custom facts. Still, those will be populated before even the compiler run.
Likely the best approach in this situation is to make the manifest tolerate the absence of the information, so that anything that depends on the value that your lookup_tenant_by_name function returns will only be evaluated if that value is available. This will usually be during the second Puppet run.
if $tenant_id == "" {
notify { "cannot yet find tenant $username": }
}
else {
# your code using the tenant ID
}

Updating an AutoScalingGroup with a new LaunchConfiguration in boto

I have a script that needs to update a named AutoScalingGroup with a new LaunchConfiguration for some new just-created AMI. Unfortunately the documentation isn't good, and I'm tired of trial-and-error. This is what I have so far:
build_autoscale_name = "build_autoscaling"
build_autoscale_lc = LaunchConfiguration(
...launch config stuff...
, image_id=imid # new AMI
)
as_conn.create_launch_configuration(build_autoscale_lc)
ag = AutoScalingGroup(
group_name=build_autoscale_name
, launch_config=build_autoscale_lc
...other ASG stuff...
)
as_conn.create_auto_scaling_group(ag)
The latest way this is failing is with:
Launch Configuration by this name already exists
If I comment out the create_launch_configuration() I then get:
AutoScalingGroup by this name already exists
I see AutoScalingGroup has an update method; do I need to perhaps get_all_groups() then do update with a new LaunchConfiguration with the same name? Or does it matter if I create a newly-named LaunchConfiguration every time (i.e. will I run into some limit)?
I was experiencing a similar problem, when trying to update an existing autoscaling group, and managed to sort it out with the the process you suggested in your original post: using get_all_groups() to fetch the autoscaling group, and then calling update() on the object after updating the attributes.
Full example:
autoscaling_group_name = 'my-test-asg'
launch_config_name = 'my-test-lc'
launch_config = LaunchConfiguration( name=launch_config_name,
image_id=image_id,
key_name=ssh_key_name,
security_groups=security_groups,
user_data=user_data,
instance_type=instance_type,
associate_public_ip_address=associate_public_ip )
as_group = as_conn.get_all_groups(names=[autoscaling_group_name])[0]
setattr(as_group, launch_config_name, launch_config)
as_group.update()
I am not familiar with boto, but I can clear few doubts about autoscaling in AWS. To update launch configuration of an autoscaling group you will have to create a new launch configuration and update the launch config for autoscaling group. You can either keep two names for launchconfig. So if the first name is in use then delete the launch config with the second name and create a new one with the second name after that update autoscaling group and same if launchconfig in use has the second name. So, you will have only two launch configs at a time.
Hope I have understood you problem correctly.

Resources