Getting 403 while downloading files from a certain folder in an Amazon S3 bucket in a Spring Boot application

I am using an S3 bucket in my Spring Boot application.
I created a folder in this bucket and uploaded files into it from the application using the upload function below. When I list the files in the folder I can see them, but downloading them always fails with a 403.
Code snippets for downloading, listing, and uploading the objects:
//Download is failing
public File downloadObject(String filePath) {
    File file = null;
    log.info("Downloading object {} from s3 bucket {}", filePath, bucketName);
    try {
        file = File.createTempFile(filePath, "");
        file.deleteOnExit();
        amazonS3.getObject(new GetObjectRequest(bucketName, filePath), file);
    } catch (Exception exception) {
        exception.printStackTrace();
    }
    return file;
}
//Following function is working perfectly fine
public List<String> listObjects(String pathPrefix) {
    final ListObjectsV2Result listingResponse = amazonS3.listObjectsV2(new ListObjectsV2Request()
            .withPrefix(pathPrefix)
            .withBucketName(bucketName));
    if (Objects.nonNull(listingResponse)) {
        List<String> result = listingResponse.getObjectSummaries().stream()
                .map(S3ObjectSummary::getKey)
                .collect(Collectors.toList());
        result.remove(pathPrefix);
        return result;
    }
    return Collections.emptyList();
}
//uploading is also working fine
public void uploadFile(InputStream inputStream, String filePath) {
    try {
        amazonS3.putObject(new PutObjectRequest(bucketName, filePath, inputStream, null));
    } catch (SdkClientException exception) {
        exception.printStackTrace();
    }
}
The S3 bucket policy is as follows:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAReadWriteAccessToBucket",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456:role/abcd"
            },
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::test-bucket/*"
        }
    ]
}
As you can see, the bucket policy grants every S3 action. Even so, I cannot figure out why the download is failing. Please help.

The first thing I notice is that you are using the old V1 S3 API. Amazon strongly recommends moving to the AWS SDK for Java V2.
The AWS SDK for Java 2.x is a major rewrite of the version 1.x code base. It's built on top of Java 8+ and adds several frequently requested features. These include support for non-blocking I/O and the ability to plug in a different HTTP implementation at run time.
The Amazon S3 V2 Java API works nicely in a Spring application. There is a multi-service example that shows the S3 V2 Java API used within a Spring Boot app. In that use case, we get a byte[] to pass to the Amazon Rekognition service.
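As a rough illustration of wiring the V2 client into a Spring app (a minimal sketch, not taken from the linked example: the bean name and Region.US_EAST_1 are assumptions, and credentials are resolved by the default provider chain):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

@Configuration
public class S3ClientConfig {

    // Minimal sketch: the region is an assumption; credentials come from the
    // default credentials provider chain (env vars, profile, or instance role).
    @Bean
    public S3Client s3Client() {
        return S3Client.builder()
                .region(Region.US_EAST_1)
                .build();
    }
}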
To get a byte[] from an object in an Amazon S3 bucket (which is what I assume you mean by download), you can use V2 code like this:
public byte[] getObjectBytes(String bucketName, String keyName) {
    s3 = getClient();
    try {
        // Create a GetObjectRequest instance
        GetObjectRequest objectRequest = GetObjectRequest
                .builder()
                .key(keyName)
                .bucket(bucketName)
                .build();

        // Get the byte[] from this S3 object
        ResponseBytes<GetObjectResponse> objectBytes = s3.getObjectAsBytes(objectRequest);
        byte[] data = objectBytes.asByteArray();
        return data;
    } catch (S3Exception e) {
        System.err.println(e.awsErrorDetails().errorMessage());
        System.exit(1);
    }
    return null;
}
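Since your original downloadObject returns a File, one way to adapt this (a minimal sketch; downloadToFile is a hypothetical helper that calls the getObjectBytes method above) is to write the returned bytes to a temporary file:

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper built on getObjectBytes() above: writes the
// object's bytes to a temporary file and returns that file.
public File downloadToFile(String bucketName, String keyName) throws IOException {
    byte[] data = getObjectBytes(bucketName, keyName);
    Path target = Files.createTempFile("s3-download-", ".tmp");
    Files.write(target, data);
    return target.toFile();
}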
Refer to this end-to-end example that shows you how to perform this use case in a Spring app. Look at the code in the S3Service class.
Creating an example AWS photo analyzer application using the AWS SDK for Java
I just ran this app and it works perfectly...

Related

Is it possible to save a group of files with MinIO client in one transaction?

I have a Spring Boot application which stores files on a MinIO server. My application receives groups of files and should either save all files in a group or save nothing if the group has a problem. I use io.minio.MinioClient#putObject for each file in a group. Currently my code looks like this:
fun saveFile(folderName: String, fileName: String, file: ByteArray) {
    file.inputStream().use {
        minioClient.putObject(folderName, fileName, it, PutObjectOptions(file.size.toLong(), -1))
    }
}

fun saveFiles(folderName: String, files: Map<String, ByteArray>) {
    try {
        files.forEach { (fileName, file) -> saveFile(folderName, fileName, file) }
    } catch (e: Exception) {
        files.forEach { (fileName, _) -> minioClient.removeObject(folderName, fileName) }
        throw e
    }
}
I wonder how I could refactor my saveFiles method to make it more transactional.
N.B. There are no rules about reading files by groups - each file could be read individually.
You can try this S3 feature; MinIO also supports it.
Create a .tar or .zip archive and send it to S3 with the metadata option snowball-auto-extract=true (header: X-Amz-Meta-Snowball-Auto-Extract); the archive will be automatically extracted in S3.
This is not a transaction, but it looks very similar to me.
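For illustration, a minimal sketch of such an upload using the AWS SDK for Java V2 (the MinIO client can set the same user metadata); the bucket name, archive name, and default client configuration are assumptions:

import java.nio.file.Paths;
import java.util.Map;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class SnowballUploadSketch {
    public static void main(String[] args) {
        // Assumed values: "files" and "group-1.tar" are placeholders; the
        // endpoint and credentials come from the default client configuration.
        try (S3Client s3 = S3Client.create()) {
            PutObjectRequest request = PutObjectRequest.builder()
                    .bucket("files")
                    .key("group-1.tar")
                    // Sent as the X-Amz-Meta-Snowball-Auto-Extract header
                    .metadata(Map.of("snowball-auto-extract", "true"))
                    .build();
            s3.putObject(request, RequestBody.fromFile(Paths.get("group-1.tar")));
        }
    }
}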

AWS Signed requests when indexing documents with Spring Data Elasticsearch

I'm unable to index a document in the AWS-hosted Elasticsearch cluster using signed requests.
Infrastructure setup
Elasticsearch version: 7.4
Access policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "es:*",
            "Resource": "arn:aws:es:<RESOURCE>/*"
        }
    ]
}
Code
The following code loads the client libraries using version 7.6. I have also downgraded them to match the cluster version but with no effect.
build.gradle
// ...
implementation("org.springframework.data:spring-data-elasticsearch")
implementation("org.elasticsearch:elasticsearch")
implementation("org.elasticsearch.client:elasticsearch-rest-high-level-client")
// ...
The client configuration is defined as follows. Environment variables such as AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_PROFILE are set.
@Configuration
public class ElasticsearchClientConfig extends AbstractElasticsearchConfiguration {

    @Value("${elasticsearch.host}")
    private String elasticsearchHost;

    @Value("${elasticsearch.port}")
    private int elasticsearchPort;

    @Override
    @Bean
    public RestHighLevelClient elasticsearchClient() {
        var SERVICE_NAME = "es";
        var REGION = "us-east-1";
        var defaultCP = new DefaultAWSCredentialsProviderChain();
        AWS4Signer signer = new AWS4Signer();
        signer.setServiceName(SERVICE_NAME);
        signer.setRegionName(REGION);
        HttpRequestInterceptor interceptor =
                new AWSRequestSigningApacheInterceptor(SERVICE_NAME, signer, defaultCP);
        RestClientBuilder restClientBuilder = RestClient
                .builder(HttpHost.create(elasticsearchHost))
                .setHttpClientConfigCallback(hacb -> hacb.addInterceptorLast(interceptor));
        return new RestHighLevelClient(restClientBuilder);
    }
}
Where the AWSRequestSigningApacheInterceptor is taken from here.
So far so good. When the application starts, it accesses the cluster and creates the relevant indices correctly.
Problem
The problem occurs when performing the save() operation from a Spring Data repository. Two requests are made to ES:
@Override
public <S extends T> S save(S entity) {
    Assert.notNull(entity, "Cannot save 'null' entity.");
    operations.save(entity, getIndexCoordinates());
    operations.indexOps(entity.getClass()).refresh();
    return entity;
}
Looking at the logs, the first one succeeds. The second call ends with the following error:
org.elasticsearch.client.ResponseException: method [POST], host [HOST], URI [/asset/_refresh?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true], status line [HTTP/1.1 403 Forbidden]
{"message":"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details."}
Looking at more detailed logs for both operations
Call for saving (ends with 200 status code):
com.amazonaws.auth.AWS4Signer : AWS4 Canonical Request: '"PUT
/asset/_doc/2
timeout=1m
content-length:128
content-type:application/json
host:<HOST>
user-agent:Apache-HttpAsyncClient/4.1.4 (Java/11.0.2)
x-amz-date:20200715T110349Z
content-length;content-type;host;user-agent;x-amz-date
55c1faf282ca0da145667bf7632f667349dbe30ed1edc64439cec2e8d463e176"
2020-07-15 13:03:49.240 DEBUG 3942 --- [nio-8080-exec-1] com.amazonaws.auth.AWS4Signer : AWS4 String to Sign: '"AWS4-HMAC-SHA256
20200715T110349Z
20200715/us-east-1/es/aws4_request
76b6547ad98145ef7ad514baac4ce67fa885bd56073e9855757ade19e28f6fec"
Call for refreshing (ends with 403 status code):
com.amazonaws.auth.AWS4Signer : AWS4 Canonical Request: '"POST
/asset/_refresh
host:<HOST>
user-agent:Apache-HttpAsyncClient/4.1.4 (Java/11.0.2)
x-amz-date:20200715T110349Z
host;user-agent;x-amz-date
bbe4763d6a0252c6e955bcc4884e15035479910b02395548dbb16bcbad1ddf95"
2020-07-15 13:03:49.446 DEBUG 3942 --- [nio-8080-exec-1] com.amazonaws.auth.AWS4Signer : AWS4 String to Sign: '"AWS4-HMAC-SHA256
20200715T110349Z
20200715/us-east-1/es/aws4_request
189b39cf0475734e29c7f9cd5fd845fc95f73c95151a3b6f6d430b95f6bee47e"
When indexing documents directly using lower-level clients everything works fine. I suspect that signature calculation behaves incorrectly for subsequent API calls.
I had the same issue. In my case I was using AWSRequestSigningApacheInterceptor, and I had an old version of it. After upgrading to the latest version, it was fixed.

Apache Ozone + AWS S3 .Net API: PutObject is creating a bucket instead of a key

I am trying to create keys in Apache Ozone using the AWS S3 API for .NET.
The key I am trying to create must be inside a bucket called "test" that I created using the AWS S3 CLI.
My code:
static async Task WriteFile()
{
    AmazonS3Config config = new AmazonS3Config();
    config.ServiceURL = "http://myApacheOzoneEndpoint:8744"; // This port is mapped from a docker container (it is not the original endpoint port for Ozone)
    AWSCredentials credentials = new BasicAWSCredentials("testuser/scm@EXAMPLE.COM", "c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e567e8296d2999"); // Credentials must be set but can be random since Ozone doesn't use authentication
    AmazonS3Client client = new AmazonS3Client(credentials, config);

    using (FileStream fs = File.OpenRead(@"C:\Users\me\path.to.file\image.jpg"))
    {
        string responseBody = "";
        try
        {
            PutObjectRequest request = new PutObjectRequest
            {
                BucketName = "test",
                Key = "deleteme.jpg",
                InputStream = fs
            };
            PutObjectResponse response = await client.PutObjectAsync(request);
            Console.WriteLine($"Result: {response.HttpStatusCode.ToString()}");
        }
        catch (AmazonS3Exception e)
        {
            Console.WriteLine("Error encountered ***. Message:'{0}' when writing an object", e.Message);
        }
        catch (Exception e)
        {
            Console.WriteLine("Unknown encountered on server. Message:'{0}' when writing an object", e.Message);
        }
    }
}
This code returns a 200 (OK) status code, and if I call GetObjectAsync from the API I get HTML with metadata as the result (but I can't read the file content at the moment).
Then I go to Apache Ozone and execute these commands within the Ozone shell:
bash-4.2$ ozone s3 path test
Volume name for S3Bucket is : s3c89e813c80ffcea9543004d57b2a1239
Ozone FileSystem Uri is : o3fs://test.s3c89e813c80ffcea9543004d57b2a1239
bash-4.2$ ozone sh bucket list /s3c89e813c80ffcea9543004d57b2a1239
[ {
"volumeName" : "s3c89e813c80ffcea9543004d57b2a1239",
"bucketName" : "test",
"createdOn" : "Wed, 01 Apr 2020 08:06:40 GMT",
"acls" : null,
"versioning" : "DISABLED",
"storageType" : "DISK",
"encryptionKeyName" : "N/A"
}, {
"volumeName" : "s3c89e813c80ffcea9543004d57b2a1239",
"bucketName" : "deleteme.jpg",
"createdOn" : "Tue, 31 Mar 2020 10:57:26 GMT",
"acls" : null,
"versioning" : "DISABLED",
"storageType" : "DISK",
"encryptionKeyName" : "N/A"
} ]
bash-4.2$ ozone sh key list /s3c89e813c80ffcea9543004d57b2a1239/test => This command returns only keys that have been put from AWS S3 CLI
As you can see, the object is created as a new bucket inside the Ozone volume used for S3, instead of being created as a new key under the test bucket. If I put keys from the AWS S3 CLI, it works as expected.
What is happening?
FYI, I tried the same example with the Java API and the same issue occurs!
Thanks
The Ozone S3 gateway uses path-style addressing by default, while updated SDK libraries use virtual-hosted addressing. The quickest solution is to switch to path-style:
// AmazonS3Config config = new AmazonS3Config();
config.ForcePathStyle = true;
Alternatively, as mentioned in the docs, you could enable the virtual-hosted scheme in Ozone.
Please note that path-style addressing is going to be deprecated in AWS S3.
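Since you mention hitting the same issue from the Java API: with the AWS SDK for Java V2, the equivalent path-style setting looks roughly like the sketch below (the endpoint is the one from your question; the region is an assumption):

import java.net.URI;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.S3Configuration;

// Sketch: forces path-style addressing so the bucket stays in the URL path
// instead of being treated as part of the host name.
S3Client s3 = S3Client.builder()
        .endpointOverride(URI.create("http://myApacheOzoneEndpoint:8744"))
        .region(Region.US_EAST_1)
        .serviceConfiguration(S3Configuration.builder()
                .pathStyleAccessEnabled(true)
                .build())
        .build();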

WebTestClient with multipart file upload

I'm building a microservice using Spring Boot + WebFlux, and I have an endpoint that accepts a multipart file upload, which is working fine when I test with curl and Postman:
@PostMapping("/upload", consumes = [MULTIPART_FORM_DATA_VALUE])
fun uploadVideo(@RequestPart("video") filePart: Mono<FilePart>): Mono<UploadResult> {
    log.info("Video upload request received")
    return filePart.flatMap { video ->
        val fileName = video.filename()
        log.info("Saving video to tmp directory: $fileName")
        val file = temporaryFilePath(fileName).toFile()
        video.transferTo(file)
            .thenReturn(UploadResult(true))
            .doOnError { error ->
                log.error("Failed to save video to temporary directory", error)
            }
            .onErrorMap {
                VideoUploadException("Failed to save video to temporary directory")
            }
    }
}
I'm now trying to test using WebTestClient:
@Test
fun shouldSuccessfullyUploadVideo() {
    client.post()
        .uri("/video/upload")
        .contentType(MULTIPART_FORM_DATA)
        .syncBody(generateBody())
        .exchange()
        .expectStatus()
        .is2xxSuccessful
}

private fun generateBody(): MultiValueMap<String, HttpEntity<*>> {
    val builder = MultipartBodyBuilder()
    builder.part("video", ClassPathResource("/videos/sunset.mp4"))
    return builder.build()
}
The endpoint returns a 500 because I haven't created the temp directory location to write the files to. However, the test passes even though I'm checking for is2xxSuccessful. If I debug into the assertion that is2xxSuccessful performs, I can see it fails because of the 500, yet I still get a green test.
I'm not sure what I am doing wrong here. The VideoUploadException that I map to simply extends ResponseStatusException:
class VideoUploadException(reason: String) : ResponseStatusException(HttpStatus.INTERNAL_SERVER_ERROR, reason)

What's the right BlobStorageService configuration format?

When creating a Microsoft Bot Framework 4 project, Startup.cs has the following code, which can be uncommented:
const string StorageConfigurationId = "<NAME OR ID>";
var blobConfig = botConfig.FindServiceByNameOrId(StorageConfigurationId);
if (!(blobConfig is BlobStorageService blobStorageConfig))
{
    throw new InvalidOperationException($"The .bot file does not contain an blob storage with name '{StorageConfigurationId}'.");
}
This code provides a way to configure an Azure Storage Account via JSON configuration.
However, the project lacks an example of what the config JSON should look like for the "is BlobStorageService" check to work.
I have tried various formats and searched for examples, but cannot make it work.
Has anyone got this nailed?
Got it working using this json...
{
    "type": "blob", // Must be 'blob'
    "name": "<NAME OF CONFIG - MUST BE UNIQUE (CAN BE ID)>",
    "connectionString": "<COPY FROM AZURE DASHBOARD>",
    "container": "<NAME OF CONTAINER IN STORAGE>"
}
