Hide or encrypt credentials information in AWS Data Pipeline

I am creating an AWS Data Pipeline to copy data from MySQL to S3. I have written a shell script that accepts the credentials as arguments and creates the pipeline, so that my credentials are not exposed in the script itself.
I used the bash script below to create the pipeline:
unique_id="$(date +'%s')"
profile="${4}"
startDate="${1}"

echo "{\"values\":{\"myS3CopyStartDate\":\"$startDate\",\"myRdsUsername\":\"$2\",\"myRdsPassword\":\"$3\"}}" > mysqlToS3values.json

sqlpipelineId=$(aws datapipeline create-pipeline --name mysqlToS3 --unique-id "mysqlToS3_$unique_id" --profile "$profile" --query '{ID:pipelineId}' --output text)

validationErrors=$(aws datapipeline put-pipeline-definition --pipeline-id "$sqlpipelineId" --pipeline-definition file://mysqlToS3.json --parameter-objects file://mysqlToS3Parameters.json --parameter-values-uri file://mysqlToS3values.json --query 'validationErrors' --profile "$profile")

aws datapipeline activate-pipeline --pipeline-id "$sqlpipelineId" --profile "$profile"
However, when I fetch the pipeline definition through the AWS CLI using
aws datapipeline get-pipeline-definition --pipeline-id 27163782
I get my credentials in plain text in the JSON output:
{ "parameters": [...], "objects": [...], "values": { "myS3CopyStartDate": "2018-04-05T10:00:00", "myRdsPassword": "sbc", "myRdsUsername": "ksnck" } }
Is there any way to encrypt or hide the credentials information?

I don't think there is a way to mask the data in the pipeline definition.
The strategy I have used is to store my secrets in S3 (encrypted with a specific KMS key and with appropriate IAM/bucket permissions). Then, inside my Data Pipeline step, I use the AWS CLI to read the secret from S3 and pass it to the mysql command (or whatever needs it).
So instead of having a pipeline parameter like myRdsPassword I have:
"myRdsPasswordFile": "s3://mybucket/secrets/rdspassword"
Then inside my step I read it with something like:
PWD=$(aws s3 cp ${myRdsPasswordFile} -)
You could also have a similar workflow that retrieves the password from AWS Parameter Store instead of S3.
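As a rough sketch of that Parameter Store variant (the parameter name here is a placeholder, and it assumes the pipeline's role can read and decrypt the parameter), a step script could fetch the password with boto3 like this:

import boto3

# Hypothetical SecureString parameter name; use your own naming scheme.
PASSWORD_PARAM_NAME = "/myapp/rds/password"

def get_rds_password():
    ssm = boto3.client("ssm")
    # WithDecryption=True asks SSM to decrypt the SecureString with its KMS key.
    response = ssm.get_parameter(Name=PASSWORD_PARAM_NAME, WithDecryption=True)
    return response["Parameter"]["Value"]

In a shell-based step you could do the equivalent with the CLI, e.g. aws ssm get-parameter --name "/myapp/rds/password" --with-decryption --query Parameter.Value --output text.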

There is actually a way that's built into data pipelines:
You prepend the field name with an * and Data Pipeline will encrypt the value and hide it visually, like a password form field.
If you're using parameters, prepend the * to both the object field and the corresponding parameter field, like so (note that a parameterized setup ends up with three asterisks in total; the example below is only a sample and omits required fields to keep the illustration of parameter encryption simple):
...
  {
    "*password": "#{*myDbPassword}",
    "name": "DBName",
    "id": "DB"
  }
],
"parameters": [
  {
    "id": "*myDbPassword",
    "description": "Database password",
    "type": "String"
  }
...
See more below:
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-pipeline-characters.html

You can store the RDS credentials in AWS Secrets Manager and then retrieve them from Secrets Manager in the Data Pipeline using a CloudFormation template, as described below:
Mappings:
  RegionToDatabaseConfig:
    us-west-2:
      CredentialsSecretKey: us-west-2-SECRET_NAME
      # ...
    us-east-1:
      CredentialsSecretKey: us-east-1-SECRET_NAME
      # ...
    eu-west-1:
      CredentialsSecretKey: eu-west-1-SECRET_NAME
      # ...

Resources:
  OurProjectDataPipeline:
    Type: AWS::DataPipeline::Pipeline
    Properties:
      # ...
      PipelineObjects:
        # ...
        # RDS resources
        - Id: PostgresqlDatabase
          Name: Source database to sync data from
          Fields:
            - Key: type
              StringValue: RdsDatabase
            - Key: username
              StringValue:
                !Join
                  - ''
                  - - '{{resolve:secretsmanager:'
                    - !FindInMap
                      - RegionToDatabaseConfig
                      - {Ref: 'AWS::Region'}
                      - CredentialsSecretKey
                    - ':SecretString:username}}'
            - Key: "*password"
              StringValue:
                !Join
                  - ''
                  - - '{{resolve:secretsmanager:'
                    - !FindInMap
                      - RegionToDatabaseConfig
                      - {Ref: 'AWS::Region'}
                      - CredentialsSecretKey
                    - ':SecretString:password}}'
            - Key: jdbcProperties
              StringValue: 'allowMultiQueries=true'
            - Key: rdsInstanceId
              StringValue:
                !FindInMap
                  - RegionToDatabaseConfig
                  - {Ref: 'AWS::Region'}
                  - RDSInstanceId
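If you would rather resolve the secret at run time instead of at deploy time, a rough boto3 sketch for a step script could look like this (the secret name is just the mapping value from above; adjust it to your own and make sure the pipeline's role is allowed to call Secrets Manager):

import json
import boto3

# Hypothetical secret name; in the template above it comes from the RegionToDatabaseConfig mapping.
SECRET_NAME = "us-east-1-SECRET_NAME"

def get_rds_credentials():
    client = boto3.client("secretsmanager")
    # SecretString holds the JSON blob with "username" and "password" keys.
    secret = json.loads(client.get_secret_value(SecretId=SECRET_NAME)["SecretString"])
    return secret["username"], secret["password"]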

Related

Pass a CloudFormation YAML list via a JSON string parameter

I am attempting to import an existing load balancer into a CloudFormation stack. The listeners must be specified as a YAML list, but there is no CloudFormation parameter type for a list (array) or object, so the parameter for the YAML list has to be a string. This causes the following CloudFormation error:
Value of property Listeners must be of type List
The value of the string parameter for the listeners is set using the CLI -
aws elb describe-load-balancers --load-balancer-names $ELB_DNS_NAME --query 'LoadBalancerDescriptions[0].ListenerDescriptions[].Listener' | jq --compact-output '.' | sed -e 's/"/\\"/g'
Notice that the resultant JSON from the above command is escaped. I suspect that this is the root cause of the issue.
[
  ...
  {
    "ParameterKey": "ElbListeners",
    "ParameterValue": "[{\"Protocol\":\"TCP\",\"LoadBalancerPort\":443,\"InstanceProtocol\":\"TCP\",\"InstancePort\":31672},{\"Protocol\":\"TCP\",\"LoadBalancerPort\":80,\"InstanceProtocol\":\"TCP\",\"InstancePort\":30545}]"
  },
  ...
]
CloudFormation doesn't seem to offer any way of un-escaping the string parameter, so the following template fails.
AWSTemplateFormatVersion: 2010-09-09
Resources:
  ...
  IngressLoadBalancer:
    Type: AWS::ElasticLoadBalancing::LoadBalancer
    DeletionPolicy: Delete
    Properties:
      Listeners: !Ref ElbListeners
      LoadBalancerName: !Ref ElbName
Parameters:
  ...
  ElbListeners:
    Type: String
    Description: Listeners for the load balancer
    Default: ""
  ElbName:
    Type: String
    Description: Name of the load balancer
    Default: ""
Replacing the quotes in the resultant JSON with ${quote} in the parameters file, and then substituting the quotes back in with !Sub, also fails; it seems the first argument of !Sub can't be a !Ref to a parameter.
I don't know how many listeners there will be, so it's not feasible to hardcode a list of listeners in the template and pass in multiple parameters for the ports/protocols.
How can I pass a YAML list as a JSON string parameter?
You can take the content of the ElbListeners parameter and simply insert it into the template, removing it from your Parameters. The resulting template would look like:
AWSTemplateFormatVersion: 2010-09-09
Resources:
  ...
  IngressLoadBalancer:
    Type: AWS::ElasticLoadBalancing::LoadBalancer
    DeletionPolicy: Delete
    Properties:
      Listeners:
        - Protocol: TCP
          LoadBalancerPort: 443
          InstanceProtocol: TCP
          InstancePort: 31672
        - Protocol: TCP
          LoadBalancerPort: 80
          InstanceProtocol: TCP
          InstancePort: 30545
      LoadBalancerName: !Ref ElbName
Parameters:
  ...
  ElbName:
    Type: String
    Description: Name of the load balancer
    Default: ""

How do I access the Cognito UserPoolClient Secret in Lambda function?

I have created a Cognito UserPool and UserPoolClient via Resources in the serverless.yml file like this:
CognitoUserPool:
  Type: AWS::Cognito::UserPool
  Properties:
    AccountRecoverySetting:
      RecoveryMechanisms:
        - Name: verified_email
          Priority: 2
    UserPoolName: ${self:provider.stage}-user-pool
    UsernameAttributes:
      - email
    MfaConfiguration: OFF
    Policies:
      PasswordPolicy:
        MinimumLength: 8
        RequireLowercase: True
        RequireNumbers: True
        RequireSymbols: True
        RequireUppercase: True
CognitoUserPoolClient:
  Type: AWS::Cognito::UserPoolClient
  Properties:
    ClientName: ${self:provider.stage}-user-pool-client
    UserPoolId:
      Ref: CognitoUserPool
    ExplicitAuthFlows:
      - ALLOW_USER_PASSWORD_AUTH
      - ALLOW_REFRESH_TOKEN_AUTH
    GenerateSecret: true
Now I can pass the UserPool and UserPoolClient IDs as environment variables to the Lambda functions like this:
my_function:
  package: {}
  handler:
  events:
    - http:
        path: <path>
        method: post
        cors: true
  environment:
    USER_POOL_ID: !Ref CognitoUserPool
    USER_POOL_CLIENT_ID: !Ref CognitoUserPoolClient
I can access these IDs in my code as -
USER_POOL_ID = os.environ['USER_POOL_ID']
USER_POOL_CLIENT_ID = os.environ['USER_POOL_CLIENT_ID']
I have printed the values and they are printed correctly. However, the UserPoolClient also generates an app client secret, which I need when generating the secret hash. How do I access the app client secret (the UserPoolClient's secret) in my Lambda?
Probably not what you hoped for, but you cannot explicitly export the client secret in CloudFormation. Take a look at the return values of AWS::Cognito::UserPoolClient: you can only get the client ID.
What you could do is create the client in another CloudFormation template and either add a custom resource there that reads the secret and outputs it, or add an intermediate step where you fetch the value with the CLI and then pass it into serverless.
There is currently no other option.
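Whichever route you take (custom resource or CLI step), the actual lookup is a DescribeUserPoolClient call. A minimal boto3 sketch, assuming the caller's role has cognito-idp:DescribeUserPoolClient and reusing the environment variables from the question:

import os
import boto3

cognito = boto3.client("cognito-idp")

def get_client_secret():
    # Uses the IDs already passed in as environment variables in the question.
    response = cognito.describe_user_pool_client(
        UserPoolId=os.environ["USER_POOL_ID"],
        ClientId=os.environ["USER_POOL_CLIENT_ID"],
    )
    # ClientSecret is only present because GenerateSecret: true was set on the client.
    return response["UserPoolClient"]["ClientSecret"]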

How to read a value from a json file in AWS Codepipeline?

My question: How can Codepipeline read the value of a field in a json file which is in SourceCodeArtifact?
I have Gthub repo that contains a file imageManifest.json which looks like this:
{
  "image_id": "docker.pkg.github.com/my-org/my-repo/my-app",
  "image_version": "1.0.1"
}
I want my AWS Codepipeline Source stage to be able to read the value of image_version from imageManifest.json and pass it as a parameter to a CloudFormation action in a subsequent stage of my pipeline.
For reference, here is my source stage.
Stages:
  - Name: GitHubSource
    Actions:
      - Name: SourceAction
        ActionTypeId:
          Category: Source
          Owner: ThirdParty
          Version: '1'
          Provider: GitHub
        OutputArtifacts:
          - Name: SourceCodeArtifact
        Configuration:
          Owner: !Ref GitHubOwner
          Repo: !Ref GitHubRepo
          OAuthToken: !Ref GitHubAuthToken
And here is my deploy stage:
  - Name: DevQA
    Actions:
      - Name: DeployInfrastructure
        InputArtifacts:
          - Name: SourceCodeArtifact
        ActionTypeId:
          Category: Deploy
          Owner: AWS
          Provider: CloudFormation
          Version: '1'
        Configuration:
          StackName: !Ref AppName
          Capabilities: CAPABILITY_NAMED_IAM
          RoleArn: !GetAtt [CloudFormationRole, Arn]
          ParameterOverrides: !Sub '{"ImageId": "${image_version??}"}'
Note that image_version in the last line above is just my aspirational placeholder to illustrate how I hope to use the image_version json value.
How can Codepipeline read the value of a field in a json file which is in SourceCodeArtifact?
StepFunctions? Lambda? CodeBuild?
You can use a CodeBuild step between the Source and Deploy stages.
In the CodeBuild step, read image_version from SourceCodeArtifact (the artifact produced by the source stage) and write it into a 'Template configuration' file, which is a configuration property of the CloudFormation action. This file can hold parameter values for your CloudFormation stack; use it instead of the ParameterOverrides you are currently using.
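As a sketch of what that CodeBuild step might run (the output file name template-config.json is an assumption; the manifest name comes from the question):

import json

# Read the manifest that arrives in the source artifact.
with open("imageManifest.json") as f:
    manifest = json.load(f)

# Template configuration files use a top-level "Parameters" object.
config = {"Parameters": {"ImageId": manifest["image_version"]}}

# Write the file the CloudFormation action will point at; include it in the
# CodeBuild output artifact and reference it via TemplateConfiguration.
with open("template-config.json", "w") as f:
    json.dump(config, f)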
Fn::GetParam is what you want. It returns a value from a key-value pair in a JSON-formatted file, and the JSON file must be included in an artifact.
Here is the documentation and it gives you some examples: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/continuous-delivery-codepipeline-parameter-override-functions.html#w2ab1c13c20b9
It should be something like:
ParameterOverrides: |
  {
    "ImageId": { "Fn::GetParam": ["SourceCodeArtifact", "imageManifest.json", "image_id"] }
  }

How to pass Cognito UserPoolID, client secret to AWS Lambda during Cloudformation script execution?

I created a CloudFormation script which creates AWS Cognito resources and deploys a set of AWS Lambda functions.
The CloudFormation yaml looks like this:
UserPool:
  Type: "AWS::Cognito::UserPool"
  Properties:
    UserPoolName: !Sub ${EnvPrefix}-smartshoesuserpool
    Policies:
      PasswordPolicy:
        MinimumLength: 8
        RequireUppercase: true
        RequireLowercase: true
        RequireNumbers: true
        RequireSymbols: true
        TemporaryPasswordValidityDays: 7
    AutoVerifiedAttributes:
      - email
    AliasAttributes:
      - email
    EmailVerificationMessage: 'Your verification code is {####}. '
    EmailVerificationSubject: Your verification code
    VerificationMessageTemplate:
      EmailMessage: 'Your verification code is {####}. '
      EmailSubject: Your verification code
      DefaultEmailOption: CONFIRM_WITH_CODE
    MfaConfiguration: 'OFF'
    EmailConfiguration:
      EmailSendingAccount: COGNITO_DEFAULT
    AdminCreateUserConfig:
      AllowAdminCreateUserOnly: false
      InviteMessageTemplate:
        SMSMessage: 'Your username is {username} and temporary password is {####}. '
        EmailMessage: 'Your username is {username} and temporary password is {####}. '
        EmailSubject: Your temporary password
    UsernameConfiguration:
      CaseSensitive: false
    AccountRecoverySetting:
      RecoveryMechanisms:
        - Priority: 1
          Name: verified_email
        - Priority: 2
          Name: verified_phone_number
    UserPoolTags:
      Creator: !Ref CreatorUsername
      Environment: !Ref EnvPrefix

# User Pool client
# export with: aws cognito-idp describe-user-pool-client --user-pool-id eu-central-1_E5ZQHWb1N --client-id 7oasfnq1cld9sh4jajjap2g80p
UserPoolClient:
  Type: "AWS::Cognito::UserPoolClient"
  Properties:
    UserPoolId: !Ref UserPool
    ClientName: !Sub ${EnvPrefix}-smartshoesuserpoolclient
    RefreshTokenValidity: 30
    ReadAttributes:
      - email
      - email_verified
    WriteAttributes:
      - email
    ExplicitAuthFlows:
      - ALLOW_ADMIN_USER_PASSWORD_AUTH
      - ALLOW_CUSTOM_AUTH
      - ALLOW_REFRESH_TOKEN_AUTH
      - ALLOW_USER_PASSWORD_AUTH
      - ALLOW_USER_SRP_AUTH
    AllowedOAuthFlowsUserPoolClient: false
    PreventUserExistenceErrors: ENABLED
Simply put, it creates a UserPool and a UserPoolClient.
But I have a problem: in the Lambda function I need to know the UserPoolId, ClientId and ClientSecret, and I have not found a way to get these values inside the CloudFormation yaml.
I can write a short Python program using Boto3 that looks up the UserPool and the other values, but I cannot execute it inside the yaml. How do you get these parameters and 'inject' them into the Lambda function during the deployment phase?
.....
def initiate_auth(client, username, password):
    secret_hash = get_secret_hash(username)
    try:
        resp = client.admin_initiate_auth(
            UserPoolId=USER_POOL_ID,
            ClientId=CLIENT_ID,
            AuthFlow='ADMIN_NO_SRP_AUTH',
            AuthParameters={
                'USERNAME': username,
                'SECRET_HASH': secret_hash,
                'PASSWORD': password,
            },
            ClientMetadata={
                'username': username,
                'password': password,
            })
    ....
I can write short Python program using Boto3 that search UserPool and other values but I cannot execute it inside yaml
You can consider developing a custom resource in CloudFormation. The resource would be a Lambda function that executes your Python script and returns any needed values to other resources in your template.
However, if you also create your Lambda functions in the same template, you can pass the IDs using the function's Environment property:
Environment variables that are accessible from function code during execution.
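For the custom-resource route, a minimal sketch of such a function (assuming its code is supplied inline via ZipFile so the cfnresponse helper module is available, and that its role may call cognito-idp:DescribeUserPoolClient) could look like this:

import boto3
import cfnresponse  # available when the function code is provided inline via ZipFile

def handler(event, context):
    try:
        if event["RequestType"] == "Delete":
            cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
            return
        props = event["ResourceProperties"]
        client = boto3.client("cognito-idp")
        resp = client.describe_user_pool_client(
            UserPoolId=props["UserPoolId"],
            ClientId=props["ClientId"],
        )
        # Return the secret as resource data so the template can read it.
        data = {"ClientSecret": resp["UserPoolClient"]["ClientSecret"]}
        cfnresponse.send(event, context, cfnresponse.SUCCESS, data)
    except Exception:
        cfnresponse.send(event, context, cfnresponse.FAILED, {})

The template could then reference the returned value with !GetAtt on the custom resource (e.g. !GetAtt UserPoolClientSecretLookup.ClientSecret, name assumed) and pass it into your Lambda's Environment.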

Endpoint URL for DynamoDB inside localstack's Lambda function

I'm using LocalStack for local development. I have a DynamoDB table named readings and I'd like to insert items into it from a Lambda function.
I have deployed a simple Lambda function using the Python runtime:
import os
import boto3

def lambda_handler(events, context):
    DYNAMODB_ENDPOINT_URL = os.environ.get("DYNAMODB_ENDPOINT_URL")
    DYNAMODB_READINGS_TABLE_NAME = os.environ.get("DYNAMODB_READINGS_TABLE_NAME")
    dynamodb = boto3.resource("dynamodb", endpoint_url=DYNAMODB_ENDPOINT_URL)
    readings_table = dynamodb.Table(DYNAMODB_READINGS_TABLE_NAME)
    readings_table.put_item(Item={"reading_id": "10", "other": "test"})
But I'm getting the error: [ERROR] EndpointConnectionError: Could not connect to the endpoint URL: "http://localstack:4569/"
I've tried combinations of localhost and localstack along with ports 4566 and 4569. All of them fail.
Here's my docker-compose service that I use to start LocalStack:
localstack:
  image: localstack/localstack:0.11.2
  ports:
    - 4566:4566
    - 8080:8080
  environment:
    SERVICES: "dynamodb,sqs,lambda,iam"
    DATA_DIR: "/tmp/localstack/data"
    PORT_WEB_UI: "8080"
    LOCALSTACK_HOSTNAME: localstack
    LAMBDA_EXECUTOR: docker
    AWS_ACCESS_KEY_ID: "test"
    AWS_SECRET_ACCESS_KEY: "test"
    AWS_DEFAULT_REGION: "us-east-1"
  volumes:
    - localstack_volume:/tmp/localstack/data
    - /var/run/docker.sock:/var/run/docker.sock
    # When a container is started for the first time, it executes files with the .sh extension found in /docker-entrypoint-initaws.d.
    # Files are executed in alphabetical order. You can easily create AWS resources on LocalStack using the `awslocal` (or `aws`) CLI tool in the initialization scripts.
    # source: https://github.com/localstack/localstack/pull/1018/files#diff-04c6e90faac2675aa89e2176d2eec7d8R185
    - ./localstack-startup-scripts/:/docker-entrypoint-initaws.d/
What would be the correct endpoint URL to set inside my Lambda so that I can send requests to LocalStack's DynamoDB?
According to the docs, LOCALSTACK_HOSTNAME is a read-only env var:
LOCALSTACK_HOSTNAME: Name of the host where LocalStack services are available. Use this hostname as endpoint (e.g., http://${LOCALSTACK_HOSTNAME}:4566) in order to access the services from within your Lambda functions (e.g., to store an item to DynamoDB or S3 from a Lambda).
Try with:
ports:
  - "0.0.0.0:4566-4599:4566-4599"
Hope it helps
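Applied to the Python Lambda from the question, that means building the endpoint from LOCALSTACK_HOSTNAME instead of a hand-set DYNAMODB_ENDPOINT_URL; a rough sketch (the table-name variable is an assumption):

import os
import boto3

def lambda_handler(event, context):
    # LOCALSTACK_HOSTNAME is injected by LocalStack into the Lambda container.
    endpoint_url = f"http://{os.environ['LOCALSTACK_HOSTNAME']}:4566"
    dynamodb = boto3.resource("dynamodb", endpoint_url=endpoint_url)
    table = dynamodb.Table(os.environ.get("DYNAMODB_READINGS_TABLE_NAME", "readings"))
    table.put_item(Item={"reading_id": "10", "other": "test"})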
By following Robert Taylor's answer, I was able to get the following working in my Java Lambda (this should really be a comment on that answer, but code formatting for a big snippet isn't adequate in a comment, so I'm sharing it as a separate answer to make people's lives easier):
var url = System.getenv("LOCALSTACK_HOSTNAME");
var credentials = new BasicAWSCredentials("mock_access_key", "mock_secret_key");
var ep = new AwsClientBuilder.EndpointConfiguration(String.format("http://%s:4566", url), "us-east-1");

s3 = AmazonS3ClientBuilder.standard()
        .withEndpointConfiguration(ep)
        .withPathStyleAccessEnabled(true)
        .withCredentials(new AWSStaticCredentialsProvider(credentials))
        .build();

dynamoDBMapper = new DynamoDBMapper(
        AmazonDynamoDBClientBuilder.standard()
                .withEndpointConfiguration(ep)
                .withCredentials(new AWSStaticCredentialsProvider(credentials))
                .build()
);
