I am trying to write a dataframe to an S3 bucket in a different account as JSON output.
The following code fails with an S3 Access Denied error in a Glue Spark streaming job. But if I run the code without the first line, it works and the output is written to the S3 bucket:
glueContext._jsc.hadoopConfiguration().set("fs.s3.canned.acl", "BucketOwnerFullControl")
glueContext.write_dynamic_frame.from_options(
    frame=dynamic_df, connection_type="s3",
    connection_options={"path": output_path},
    format=file_format, transformation_ctx="datasink")
Here is the error log:
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception:
Access Denied (Service: Amazon S3; Status Code: 403; Error Code:
AccessDenied; Request ID: W52125NY7G3EF7WH; S3 Extended Request ID:
4t9JOJedv2qNRy6W8ySxdQQ7r+TMN1MWpZCFOK1IKO6W4gx4a2oKuK5vwXUPnh4HkkPAG+LnEIc=;
Proxy: null), S3 Extended Request ID: 4t9JOJedv2qUPnh4HkkPAG+LnEIc= at
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1819)
at
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1403)
at
This looks strange to me: the bucket grants full permission, and the write works perfectly when the second statement is run on its own, but the object owner is then still the Glue account, which is what I am trying to change with fs.s3.canned.acl.
The destination bucket is also set up with the "Bucket owner preferred" object ownership option.
Please suggest what I am doing wrong.
Thanks
It's not enough to have permission in the bucket policy alone.
The Glue role was missing the s3:PutObjectAcl permission in IAM, which is required for setting the canned ACL.
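For reference, a minimal sketch of granting that permission to the Glue role as an inline policy with boto3; the role name, policy name, and bucket ARN are placeholders you would swap for your own:

import json
import boto3

iam = boto3.client("iam")

# Hypothetical inline policy: allow PutObject and PutObjectAcl on the
# destination bucket so the canned ACL can be applied (placeholder names).
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:PutObject", "s3:PutObjectAcl"],
        "Resource": "arn:aws:s3:::destination-bucket/*",
    }],
}

iam.put_role_policy(
    RoleName="MyGlueJobRole",
    PolicyName="AllowCrossAccountPutWithAcl",
    PolicyDocument=json.dumps(policy),
)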
When I deploy Lambda code using CDK, the deploy process (CloudFormation, presumably running under my user) does not seem to have access to the bucket that holds the Lambda code.
I followed this tutorial: https://intro-to-cdk.workshop.aws/what-is-cdk.html and see this error when I run cdk deploy:
Lambda8C48573D) Your access has been denied by S3, please make sure your request credentials have permission to GetObject for cdktoolkit-stagingbucket-19kn1ypcmzq2q/assets/5327df
Lambda Code:
const handler = new lambda.Function(this, "TimestreamLambda", {
  runtime: lambda.Runtime.NODEJS_10_X,
  code: lambda.Code.fromAsset(path.join(__dirname, '../resources')),
  handler: "index.hello_world",
  ...
The cdk and @aws-cdk versions are 1.73.0, but I also tried 1.71.0.
Notes:
I see the bucket under my account (in my region).
When logged into this account, I can see and download the asset file.
The downloaded zip file has the correct contents.
More error details:
12/24 | 9:15:19 PM | CREATE_FAILED | AWS::Lambda::Function | TimestreamLambda (TimestreamLambda8C48573D) Your access has been denied by S3, please make sure your request credentials have permission to GetObject for cdktoolkit-stagingbucket-28hiljazvaim/assets/5327df740bdc9c380ff567xxxxxxxxxxx7a68a.zip. S3 Error Code: AccessDenied. S3 Error Message: Access Denied (Service: AWSLambdaInternal; Status Code: 403; Error Code: AccessDeniedException; Request ID: 1b813776-7647-4767-89bc-XXXXXXXXX; Proxy: null)
new Function (/Users/<user>/dev/cdk/cdk-workshop/node_modules/#aws-cdk/aws-lambda/lib/function.ts:593:35)
\_ new CdkWorkshopStack (/Users/<user>/dev/cdk/cdk-workshop/lib/cdk-workshop-stack.ts:33:21)
I also see this (using the -v option) during deploy:
env: {
CDK_DEFAULT_REGION: 'us-west-2',
CDK_DEFAULT_ACCOUNT: '94646XXXXX',
CDK_CONTEXT_JSON: '{"#aws-cdk/core:enableStackNameDuplicates":"true","aws-cdk:enableDiffNoFail":"true","#aws-cdk/core:stackRelativeExports":"true","aws:cdk:enable-path-metadata":true,"aws:cdk:enable-asset-metadata":true,"aws:cdk:version-reporting":true,"aws:cdk:bundling-stacks":["*"]}',
CDK_OUTDIR: 'cdk.out',
CDK_CLI_ASM_VERSION: '7.0.0',
CDK_CLI_VERSION: '1.73.0'
}
As it turns out, this was an issue with the internal authentication system my company uses to access AWS. Instead of using my regular AWS account for access, I had to create a temporary account (which also sets a temporary token).
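For anyone hitting something similar, temporary credentials of this kind are typically obtained through STS. A minimal, illustrative boto3 sketch follows; the role ARN and session name are made-up placeholders, and your company's tooling may wrap this differently:

import boto3

# Hypothetical example: assume a role to get short-lived credentials,
# which can then be exported as environment variables for `cdk deploy`.
sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/cdk-deploy-role",  # placeholder
    RoleSessionName="cdk-deploy-session",                      # placeholder
)
creds = resp["Credentials"]
print("export AWS_ACCESS_KEY_ID=" + creds["AccessKeyId"])
print("export AWS_SECRET_ACCESS_KEY=" + creds["SecretAccessKey"])
print("export AWS_SESSION_TOKEN=" + creds["SessionToken"])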
I am following the Ruby code sample to add a custom metric to Stackdriver; however, I keep getting a permission denied error.
client = Google::Cloud::Monitoring::Metric.new
project_name = Google::Cloud::Monitoring::V3::MetricServiceClient.project_path project_id
descriptor = Google::Api::MetricDescriptor.new(
  type: "custom.googleapis.com/my_metric#{random_suffix}",
  metric_kind: Google::Api::MetricDescriptor::MetricKind::GAUGE,
  value_type: Google::Api::MetricDescriptor::ValueType::DOUBLE,
  description: "This is a simple example of a custom metric."
)
result = client.create_metric_descriptor project_name, descriptor
The error I get is "Google::Gax::PermissionDeniedError (GaxError RPC failed, caused by 7:Permission monitoring.metricDescriptors.create denied (or the resource may not exist).)"
The environment variable GOOGLE_APPLICATION_CREDENTIALS is set, and it works fine for the Google Cloud Storage code below:
storage = Google::Cloud::Storage.new project: project_id
# Make an authenticated API request
storage.buckets.each do |bucket|
  puts bucket.name
end
At this point, I don't know what the problem is. Do I need to set up a different credential for Cloud Monitoring?
I am trying to deploy a Python Lambda to AWS. The Lambda just reads files from S3 buckets when given a bucket name and file path. It works correctly on the local machine if I run the following command:
sam build && sam local invoke --event testfile.json GetFileFromBucketFunction
The data from the file is printed to the console. Next, if I run the following command, the Lambda is packaged and sent to my-bucket:
sam build && sam package --s3-bucket my-bucket --template-file .aws-sam\build\template.yaml --output-template-file packaged.yaml
The next step is to deploy to prod, so I try the following command:
sam deploy --template-file packaged.yaml --stack-name getfilefrombucket --capabilities CAPABILITY_IAM --region my-region
The Lambda can now be seen in the Lambda console and I can run it, but no contents are returned. If I change the service role manually to one that allows S3 get/put, then the Lambda works. However, this undermines the whole point of using the AWS SAM CLI.
I think I need to add a policy to the template.yaml file. This link seems to say that I should add a policy such as the one shown there. So, I added:
Policies: S3CrudPolicy
This was added under 'Resources:GetFileFromBucketFunction:Properties:'. I then rebuild the app and re-deploy, and the deployment fails with the following errors in CloudFormation:
1 validation error detected: Value 'S3CrudPolicy' at 'policyArn' failed to satisfy constraint: Member must have length greater than or equal to 20 (Service: AmazonIdentityManagement; Status Code: 400; Error Code: ValidationError; Request ID: unique number
and
The following resource(s) failed to create: [GetFileFromBucketFunctionRole]. . Rollback requested by user.
I delete the stack to start again. My thought was that 'S3CrudPolicy' is not an off-the-shelf policy that I can just use, but something I would have to define myself in the template.yaml file?
I'm not sure how to do this, and the docs don't seem to show any very simple use-case examples (from what I can see). If anyone knows how to do this, could you post a solution?
I tried the following:
S3CrudPolicy:
  PolicyDocument:
    -
      Action: "s3:GetObject"
      Effect: Allow
      Resource: !Sub arn:aws:s3:::${cloudtrailBucket}
      Principal: "*"
But it failed with the following error:
Failed to create the changeset: Waiter ChangeSetCreateComplete failed: Waiter encountered a terminal failure state Status: FAILED. Reason: Invalid template property or properties [S3CrudPolicy]
If anyone can help write a simple policy to read/write from S3, that would be amazing. I'll also need to write another one to let Lambdas invoke other Lambdas, so a solution here (I imagine something similar?) would be great - or a decent, easy-to-use guide on how to write these policy statements?
Many thanks for your help!
Found it!! In case anyone else struggles with this, you need to add the following few lines to Resources:YourFunction:Properties in the template.yaml file:
Policies:
  - S3CrudPolicy:
      BucketName: "*"
The "*" will allow your lambda to talk to any bucket, you could switch for something specific if required. If you leave out 'BucketName' then it doesn't work and returns an error in CloudFormation syaing that S3CrudPolicy is invalid.
When I follow the steps from https://developers.google.com/sheets/api/quickstart/go and run go run quickstart.go, I get the error below:
2017/08/03 12:29:22 Unable to retrieve data from sheet. Get https://sheets.googleapis.com/v4/spreadsheets/14FXalPXVUHZ2SyNBUWJpfSzUSSimYYIR5mUU36r6_BQ/values/A%3AC?alt=json: oauth2: cannot fetch token: 401 Unauthorized
Response: {
"error" : "unauthorized_client"
}
exit status 1
You have to delete the cached credential file on your system, because after a day (or some other fixed time) the OAuth2 access token expires. To get a new access token, delete the credential file and run the program again.
I am following the steps mentioned in the AWS documentation to use an interactive Hive session over SSH.
I used the following resources:
https://github.com/ucbtwitter/getting-started/wiki/Using-Elastic-Map-Reduce-via-Command-Line
http://docs.amazonwebservices.com/ElasticMapReduce/latest/GettingStartedGuide/SignUp.html
Initially I was getting the error "Error: Missing key access-id", and then I fixed my JSON file. The JSON file is in the same format as mentioned in the links above.
When I run this command:
./elastic-mapreduce
I get the following error:
Error: Unable to parse credentials.json: can't convert String into Integer.
I also checked the values required in the JSON against the AWS documentation.
Does anyone have an idea why I am getting this error?
The region value in credentials.json must be of int type, for example:
{
  ......
  "region": 1
}