I'm using the AWS Ruby SDK v2 to access various buckets across a couple of regions. Is it possible to determine the region of each bucket at runtime before I access it, so I can avoid the error below, which I get if I configure the AWS S3 client with the wrong region?
The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.
I know I can shell out, run the command below, and parse the response, but ideally I want to stay within the Ruby SDK.
aws s3api get-bucket-location
I couldn't find any official documentation for this, but from the aws-sdk spec you should be able to use the following code to get the region:
client = Aws::S3::Client.new()
resp = client.get_bucket_location(bucket: bucket_name)
s3_region = resp.data.location_constraint
This calls the same API as aws s3api get-bucket-location.
For aws-sdk (2.6.5), it is:
client.get_bucket_location(bucket: bucket_name).location_constraint
But then, how do we get a list of the buckets that belong to a specific region?
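There is no single API call that lists buckets by region, so as a rough sketch (assuming aws-sdk v2 with credentials already configured, and a placeholder target region) you would list every bucket and look up each one's location:

require 'aws-sdk'

client = Aws::S3::Client.new(region: 'us-east-1')
target_region = 'eu-central-1' # placeholder: the region to filter on

buckets_in_region = client.list_buckets.buckets.select do |bucket|
  location = client.get_bucket_location(bucket: bucket.name).location_constraint
  # An empty location_constraint means the bucket lives in us-east-1
  location = 'us-east-1' if location.nil? || location.empty?
  location == target_region
end

puts buckets_in_region.map(&:name)

Note that this makes one get_bucket_location call per bucket, so it gets slow if you own a lot of buckets.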
Related
How can I download/get multiple objects from MinIO using a single call of the MinIO Java SDK?
In this case it is not possible to achieve this with a single call. You'll need to do the following:
First, list all the objects in the selected bucket using the listObjects call. Once you have this list, you can filter it to check whether the file(s) you want are present in the bucket.
Once you have the list of valid files, use the getObject call to fetch each file's data. At this point you can use the data stream to create a zip file with all the files that you require.
With that done, you can start streaming the zip file to start the download. A rough sketch of this flow is shown below.
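As a minimal sketch only (written against the MinIO Java SDK 7.x builder API; the bucket name, prefix filter and output stream are placeholders, not anything from the original question):

import io.minio.GetObjectArgs;
import io.minio.ListObjectsArgs;
import io.minio.MinioClient;
import io.minio.Result;
import io.minio.messages.Item;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class MultiObjectDownload {
    public static void zipMatchingObjects(MinioClient client, String bucket,
                                          String wantedPrefix, OutputStream target) throws Exception {
        try (ZipOutputStream zip = new ZipOutputStream(target)) {
            // 1. List everything in the selected bucket
            Iterable<Result<Item>> objects =
                    client.listObjects(ListObjectsArgs.builder().bucket(bucket).build());
            for (Result<Item> result : objects) {
                Item item = result.get();
                // 2. Filter down to the file(s) you actually want
                if (!item.objectName().startsWith(wantedPrefix)) {
                    continue;
                }
                // 3. Fetch each object and append its data stream to the zip
                try (InputStream stream = client.getObject(
                        GetObjectArgs.builder().bucket(bucket).object(item.objectName()).build())) {
                    zip.putNextEntry(new ZipEntry(item.objectName()));
                    stream.transferTo(zip);
                    zip.closeEntry();
                }
            }
        }
        // 4. The caller can now stream `target` (e.g. an HTTP response body) as the download
    }
}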
Please remember that the MinIO team is available on their public slack channel or by email to answer questions 24/7/365.
I have a CodePipeline set up with GitHub as the source. I am trying, without success, to pass a single secret parameter (in this case a Stripe secret key, currently defined in an .env file; explanation below) to a specific Lambda during the Deployment stage of the pipeline's execution.
The Deployment stage in my case is basically a CodeBuild project that runs the deployment.sh script:
#! /bin/bash
npm install -g serverless@1.60.4
serverless deploy --stage $env -v -r eu-central-1
Explanation:
I've tried doing this with serverless-dotenv-plugin, which serves the purpose when the deployment is done locally, but when it's done through CodePipeline it produces an error when the Lambda executes, for the following reason:
Since CodePipeline's source is set to GitHub (the .env file is not committed), the pipeline's execution is triggered whenever a change is committed to the repository. By the time it reaches the deployment stage, all node modules are installed (serverless-dotenv-plugin among them). When the serverless deploy --stage $env -v -r eu-central-1 command executes, serverless-dotenv-plugin searches for the .env file in which my secret is stored. It won't find it, because there is no .env file outside the "local" scope, so when the Lambda requiring this secret is triggered it throws an error.
So my question is, is it possible to do it with dotenv/serverless-dotenv-plugin, or should that approach be discarded? Should I maybe use SSM Parameter Store or Secrets Manager? If yes, could someone explain how? :)
So, upon further investigation of this topic I think I have the solution.
SSM Parameter Store vs. Secrets Manager is an entirely different topic, but for my purpose SSM Parameter Store is the option I chose to go with for this problem. Basically it can be done in two ways.
1. Use AWS Parameter Store
Simply add a secret in your AWS Parameter Store console, then reference the value in your serverless.yml as a Lambda environment variable. The Serverless Framework is able to fetch the value from your AWS Parameter Store account on deploy.
provider:
  environment:
    stripeSecretKey: ${ssm:stripeSecretKey}
Finally, you can reference it in your code just as before:
const stripe = Stripe(process.env.stripeSecretKey);
PROS: This can be used along with a local .env file for both local and remote usage while keeping your Lambda code the same, i.e. process.env.stripeSecretKey.
CONS: Since the secrets are decrypted and then set as Lambda environment variables on deploy, if you go to your Lambda console you'll be able to see the secret values in plain text, which is a security concern.
That brings me to the second way of doing this, which I find more secure and which I ultimately chose:
2. Store in AWS Parameter Store, and decrypt at runtime
To avoid exposing the secrets in plain text in your AWS Lambda Console, you can decrypt them at runtime instead. Here is how:
Add the secrets in your AWS Parameter Store Console just as in the above step.
Change your Lambda code to call the Parameter Store directly and decrypt the value at runtime:
import stripePackage from 'stripe';
import aws from 'aws-sdk';

const ssm = new aws.SSM();

export const handler = async (event) => {
  // getParameter(...).promise() returns a Promise, so it must be awaited
  const stripeSecretKey = await ssm
    .getParameter({ Name: 'stripeSecretKey', WithDecryption: true })
    .promise();
  const stripe = stripePackage(stripeSecretKey.Parameter.Value);
  // ... use the stripe client here
};
(Small tip: ssm.getParameter(...).promise() returns a promise, so your Lambda has to be defined as an async function and the call has to be awaited, as shown above.)
PROS: Your secrets are not exposed in plain text at any point.
CONS: Your Lambda code does get a bit more complicated, and there is an added latency since it needs to fetch the value from the store. (But considering it's only one parameter and it's free, it's a good trade-off I guess)
To conclude, for all of this to work you will need to tweak your Lambda's policy so it can access Systems Manager and the secret stored in Parameter Store; any missing permissions are easy to spot through CloudWatch. A sketch of such a policy statement is shown below.
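As a rough sketch only (the parameter name and region are the ones used above, the account id is wildcarded, and if the parameter is encrypted with a customer-managed KMS key you would also need kms:Decrypt), the permission can be granted from serverless.yml like this:

provider:
  name: aws
  iamRoleStatements:
    - Effect: Allow
      Action:
        - ssm:GetParameter
      Resource:
        - arn:aws:ssm:eu-central-1:*:parameter/stripeSecretKey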
Hopefully this helps someone out, happy coding :)
AWS has requested that the product I'm working on identify the requests it makes to our users' S3 resources on their behalf, so they can assess its impact.
To accomplish this, we have to set the User-Agent header for every upload request made against an S3 bucket from an EMR application. I'm wondering how this can be achieved.
Hadoop's documentation mentions the fs.s3a.user.agent.prefix property (core-default.xml). However, the s3a protocol seems to be deprecated on EMR (Work with Storage and File Systems), so I'm not sure whether this property will work.
To give a bit more context on what I need to do: with the AWS Java SDK it is possible to set the User-Agent header's prefix, for example:
AWSCredentials credentials; // credentials obtained elsewhere
ClientConfiguration conf = new ClientConfiguration()
        .withUserAgentPrefix("APN/1.0 PARTNER/1.0 PRODUCT/1.0");
AmazonS3Client client = new AmazonS3Client(credentials, conf);
Then every request's User-Agent HTTP header will have a value similar to: APN/1.0 PARTNER/1.0 PRODUCT/1.0, aws-sdk-java/1.11.234 Linux/4.15.0-58-generic Java_HotSpot(TM)_64-Bit_Server_VM/25.201-b09 java/1.8.0_201. I need to achieve something similar when uploading files from an EMR application.
S3A is not deprecated in ASF Hadoop; I will argue that it is now ahead of what EMR's own connector does. If you are using EMR you may be able to use it; otherwise you get to work with what they implement.
FWIW, in S3A we're looking at what it would take to dynamically change the header for a specific query, so you go beyond specific users to specific Hive/Spark queries in shared clusters. It would be fairly complex to do, though, as it needs to be a per-request setting.
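If your job does go through the s3a connector, the property from core-default.xml can be set cluster-wide through an EMR configuration classification; this is only a sketch (using the prefix value from the question), and as the next answer notes, EMRFS-backed s3:// paths ignore it:

[
  {
    "Classification": "core-site",
    "Properties": {
      "fs.s3a.user.agent.prefix": "APN/1.0 PARTNER/1.0 PRODUCT/1.0"
    }
  }
]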
The solution in my case was to include an awssdk_config_default.json file inside the JAR submitted to the EMR job. This file is used by the AWS SDK to allow developers to override some default settings.
I added this JSON file to the JAR submitted to EMR with this content:
{
"userAgentTemplate": "APN/1.0 PARTNER/1.0 PRODUCT/1.0 aws-sdk-{platform}/{version} {os.name}/{os.version} {java.vm.name}/{java.vm.version} java/{java.version}{language.and.region}{additional.languages} vendor/{java.vendor}"
}
Note: passing the fs.s3a.user.agent.prefix property to the EMR job didn't work, because AWS EMR uses EMRFS (which is built on the AWS SDK) when handling files stored in S3. I realized this because of an exception occasionally thrown in AWS EMR; part of its stack trace was:
Caused by: java.lang.ExceptionInInitializerError: null
at com.amazon.ws.emr.hadoop.fs.files.TemporaryDirectoriesGenerator.createAndTrack(TemporaryDirectoriesGenerator.java:144)
at com.amazon.ws.emr.hadoop.fs.files.TemporaryDirectoriesGenerator.createTemporaryDirectories(TemporaryDirectoriesGenerator.java:93)
at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.create(S3NativeFileSystem.java:616)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:932)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:825)
at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.create(EmrFileSystem.java:217)
at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:135)
I'm posting the answer here for future reference. Some links of interest:
The class in AWS SDK that uses this configuration file: InternalConfig.java
https://stackoverflow.com/a/31173739/1070393
EMRFS
I am trying to rotate the access keys and secret keys for all IAM users. The last time it was required I did it manually, but now I want to do it with a rule or some automation.
I went through some links and found this one:
https://github.com/miztiik/serverless-iam-key-sentry
I tried to use it, but I was not able to perform the activity; it always gave me an error. Can anyone please help, or suggest a better way to do it?
As I am new to AWS Lambda, I am also not sure how my code can be tested.
There are different ways to implement a solution. One common way to automate this is to store the IAM user access keys in AWS Secrets Manager so the keys are kept safely. You could then configure a monthly or 90-day check that rotates the keys and stores the new keys in Secrets Manager, using the AWS CLI or an SDK of your choice. A rough sketch of such a check is shown below.
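As a minimal sketch only (assuming boto3, a 90-day threshold, and a hypothetical one-secret-per-user naming convention like iam-keys/<username>; the secrets are assumed to exist already, and pagination and the two-keys-per-user limit are ignored here), a rotation job could look like this:

import json
from datetime import datetime, timezone

import boto3

MAX_AGE_DAYS = 90  # assumed rotation threshold

iam = boto3.client("iam")
secrets = boto3.client("secretsmanager")

def rotate_old_keys():
    for user in iam.list_users()["Users"]:
        username = user["UserName"]
        for key in iam.list_access_keys(UserName=username)["AccessKeyMetadata"]:
            age_days = (datetime.now(timezone.utc) - key["CreateDate"]).days
            if age_days < MAX_AGE_DAYS:
                continue
            # Create the replacement key and store it before disabling the old one
            new_key = iam.create_access_key(UserName=username)["AccessKey"]
            secrets.put_secret_value(
                SecretId=f"iam-keys/{username}",  # hypothetical secret name
                SecretString=json.dumps(
                    {
                        "AccessKeyId": new_key["AccessKeyId"],
                        "SecretAccessKey": new_key["SecretAccessKey"],
                    }
                ),
            )
            iam.update_access_key(
                UserName=username,
                AccessKeyId=key["AccessKeyId"],
                Status="Inactive",
            )

Wired into a Lambda triggered by a scheduled CloudWatch Events/EventBridge rule, this gives you the "rule or automation" part; the old keys can be deleted once you've confirmed nothing still uses them.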
I'd like to get EC2 instance metadata with Ansible and do something with those instances based on the metadata. However, ec2_facts wants to SSH into the instances in order to get the metadata.
I believe it should be possible to obtain the instances metadata without SSH connections.
Could you help me with that please?
Thank you.
There is information you can retrieve about instances using the AWS API, but ec2_facts does not use it. What that Ansible module does is fetch metadata via http://169.254.169.254/latest/meta-data/, which can only be done from the instance itself.
Some more information about what instance data you wish to fetch would be helpful. At this time there is no AWS cloud module in core that retrieves general information about an instance, but Ansible makes it easy to write one.
Here is an example of a module that returns information about instances that match a set of tags - https://github.com/edx/configuration/blob/master/playbooks/library/ec2_lookup
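The linked module does this kind of API-side lookup; as a rough sketch of the equivalent call with boto3 (the region and tag filter are placeholders):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region

# Find instances carrying a particular tag, without touching the instances themselves
response = ec2.describe_instances(
    Filters=[{"Name": "tag:environment", "Values": ["staging"]}]
)

for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        print(instance["InstanceId"], instance.get("PrivateIpAddress"))

Wrapping a call like that in a small custom module or dynamic inventory script lets a playbook act on instances without any SSH connection.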