I am uploading streams of raw bytes over HTTP POST using WebClient:
final byte[] rawData = IOUtils.toByteArray(sourceInputStream);
webClient.post()
    .uri(uri)
    .contentType(MediaType.APPLICATION_OCTET_STREAM)
    .bodyValue(rawData)
    .exchange()...
I am concerned that this potentially uses a lot of memory, given these objects can sometimes be quite big (~200 MB), so I would like to read directly from the InputStream and upload the data as a stream.
I tried:
bodyValue(BodyInserters.fromResource(new InputStreamResource(inputStream)))
but got the exception Content type 'application/octet-stream' not supported for bodyType=org.springframework.web.reactive.function.BodyInserters
So I then tried removing the header, but the data then arrived corrupted.
Is there a way to stream the data without passing it through the in-memory rawData[] buffer?
Thanks
Your first try was almost correct; however, you need to use body(...) instead of bodyValue(...):
body(BodyInserters.fromResource(new InputStreamResource(inputStream)))
This is because bodyValue(...) wraps your resource inserter in a value inserter, which then tries to serialize the resource inserter itself and fails with the error you received.
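Put together, a minimal sketch of the corrected call (assuming uri, inputStream, and the webClient instance as in the question):
webClient.post()
    .uri(uri)
    .contentType(MediaType.APPLICATION_OCTET_STREAM)
    // body(...) hands the inserter to WebClient instead of serializing it as a value
    .body(BodyInserters.fromResource(new InputStreamResource(inputStream)))
    .exchange()
    .block(Duration.ofSeconds(30));
Note that an InputStreamResource can only be read once, and since its length is not known up front the body is generally sent with chunked transfer encoding rather than a Content-Length header.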
I ended up keeping the rawData[] buffer and specifying contentLength in the request:
webClient.post()
    .uri(uri)
    .contentType(MediaType.APPLICATION_OCTET_STREAM)
    .contentLength(bytes.length)
    .bodyValue(bytes)
    .exchange()
    .block(Duration.ofSeconds(30));
For very large files I did chunked uploads - for example, uploading a file to SharePoint in 1 MB chunks might look like:
POST: https://sharepoint.acme.com:443/sharepoint-site/_api/web/GetFileByServerRelativeUrl(@f)/StartUpload(uploadId=guid'680f93eb-8c6a-443d-87ad-1e1b324ea678')?@f='/sharepoint-site/folderA/subFolderB/etc/example.file'
POST: https://sharepoint.acme.com:443/sharepoint-site/_api/web/GetFileByServerRelativeUrl(@f)/ContinueUpload(uploadId=guid'680f93eb-8c6a-443d-87ad-1e1b324ea678',fileOffset=1048576)?@f='/sharepoint-site/folderA/subFolderB/etc/example.file'
POST: https://sharepoint.acme.com:443/sharepoint-site/_api/web/GetFileByServerRelativeUrl(@f)/ContinueUpload(uploadId=guid'680f93eb-8c6a-443d-87ad-1e1b324ea678',fileOffset=2097152)?@f='/sharepoint-site/folderA/subFolderB/etc/example.file'
POST: https://sharepoint.acme.com:443/sharepoint-site/_api/web/GetFileByServerRelativeUrl(@f)/ContinueUpload(uploadId=guid'680f93eb-8c6a-443d-87ad-1e1b324ea678',fileOffset=3145728)?@f='/sharepoint-site/folderA/subFolderB/etc/example.file'
POST: https://sharepoint.acme.com:443/sharepoint-site/_api/web/GetFileByServerRelativeUrl(@f)/ContinueUpload(uploadId=guid'680f93eb-8c6a-443d-87ad-1e1b324ea678',fileOffset=4194304)?@f='/sharepoint-site/folderA/subFolderB/etc/example.file'
POST: https://sharepoint.acme.com:443/sharepoint-site/_api/web/GetFileByServerRelativeUrl(@f)/ContinueUpload(uploadId=guid'680f93eb-8c6a-443d-87ad-1e1b324ea678',fileOffset=5242880)?@f='/sharepoint-site/folderA/subFolderB/etc/example.file'
POST: https://sharepoint.acme.com:443/sharepoint-site/_api/web/GetFileByServerRelativeUrl(@f)/FinishUpload(uploadId=guid'680f93eb-8c6a-443d-87ad-1e1b324ea678',fileOffset=6291456)?@f='/sharepoint-site/folderA/subFolderB/etc/example.file'
Driven by a loop over the raw bytes:
final UUID id = UUID.randomUUID();
int offset = 0;
while (offset < bytesTotal) {
    final URI uri;
    if (offset == 0) {
        uri = createStartUri(id, path, filename);
    } else if (offset < bytesTotal - CHUNK_SIZE) {
        uri = createContinueUri(id, offset, path, filename);
    } else {
        uri = createFinishUri(id, offset, path, filename);
    }
    // ArrayUtils.subarray clamps to the array bounds, so the final chunk may be shorter
    final byte[] bytes = ArrayUtils.subarray(fileDataBytes, offset, offset + CHUNK_SIZE);
    webClient.post().uri(uri).contentType(MediaType.APPLICATION_OCTET_STREAM).contentLength(bytes.length)
            .bodyValue(bytes).exchange().block(Duration.ofSeconds(30));
    offset += CHUNK_SIZE;
}
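If keeping the whole fileDataBytes array in memory is still a concern, the same loop can instead read one chunk at a time from the source InputStream. A minimal sketch under the same assumptions (createStartUri/createContinueUri/createFinishUri, CHUNK_SIZE, a known bytesTotal, and commons-io IOUtils; the single-chunk edge case is ignored):
final UUID id = UUID.randomUUID();
final byte[] chunk = new byte[CHUNK_SIZE];
int offset = 0;
while (offset < bytesTotal) {
    // IOUtils.read fills the buffer up to CHUNK_SIZE and returns the count actually read
    final int read = IOUtils.read(sourceInputStream, chunk);
    final URI uri;
    if (offset == 0) {
        uri = createStartUri(id, path, filename);
    } else if (offset + read < bytesTotal) {
        uri = createContinueUri(id, offset, path, filename);
    } else {
        uri = createFinishUri(id, offset, path, filename);
    }
    // blocking on each request before the next read makes reusing the buffer safe
    final byte[] bytes = (read == CHUNK_SIZE) ? chunk : ArrayUtils.subarray(chunk, 0, read);
    webClient.post().uri(uri).contentType(MediaType.APPLICATION_OCTET_STREAM).contentLength(bytes.length)
            .bodyValue(bytes).exchange().block(Duration.ofSeconds(30));
    offset += read;
}
This holds at most one CHUNK_SIZE buffer in memory at a time.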
Related
I am developing a proxy service to a Minio server using WebClient that handles all Minio/S3 API endpoints. Most of them work fine, but I have encountered one case in which the PUT operation seems to get hung up when trying to set the body of the request to either an InputStream, a File, or a Resource pointing to it. (See epilogue at the bottom, as I'm left wondering where the problem really is.)
The only way I've found to make it work is to read the file contents to an in-memory byte array. The following baseline works, for example:
WebClient.UriSpec<WebClient.RequestBodySpec> uriSpec = client.method(request.getMethod());
WebClient.RequestBodySpec bodySpec = uriSpec.uri(uri);
WebClient.RequestHeadersSpec<?> headersSpec = bodySpec;
try {
    // read file to byte array; works fine
    byte[] bytes = Files.readAllBytes(Path.of(file.get().getFile().toURI()));
    // set it to the request body
    headersSpec = bodySpec.bodyValue(bytes);
} catch (IOException e) {
    throw new UncheckedIOException(e);
}
// manipulate some headers
headersSpec = headersSpec.headers(httpHeaders -> ...);
// execute the request; works fine in this scenario
return headersSpec.exchangeToMono(resp -> ...)
.doOnError(throwable -> log.error("Trouble proxying request: " + throwable.getMessage(), throwable));
However, every alternative I try for streaming this content instead results in a request that seems to hang in the headersSpec.exchangeToMono invocation. I don't see any errors on the proxy service, and the client socket eventually gives up:
java.net.SocketTimeoutException: timeout
client-tester_1 | at okio.SocketAsyncTimeout.newTimeoutException(JvmOkio.kt:143) ~[okio-jvm-2.8.0.jar:na]
client-tester_1 | Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Some examples of failure (or, paraphrasing Edison, I've successfully found at least a dozen ways that do not work):
// Use same byte array as above; Hangs
Resource resource = new ByteArrayResource(bytes);
headersSpec = bodySpec.bodyValue(resource);
// Read an input stream from the file (this one relies on a HttpMessageWriter<InputStream> that I configured on the client); Hangs
InputStream bodyStream = new BufferedInputStream(Files.newInputStream(Path.of(file.get().getFile().toURI())));
headersSpec = bodySpec.bodyValue(bodyStream);
// Resource for the file; Hangs
Resource resource = new FileSystemResource(Path.of(file.get().getFile().toURI()));
Flux<DataBuffer> flux = DataBufferUtils.read(resource, DefaultDataBufferFactory.sharedInstance, 4096);
headersSpec = bodySpec.body(flux, DataBuffer.class);
// Different resource; Hangs
Resource resource = new UrlResource(file.get().getFile().toURI());
headersSpec = bodySpec.bodyValue(resource);
// Try BodyInserters; Hangs
Flux<DataBuffer> flux = DataBufferUtils.read(Path.of(file.get().getFile().toURI()), DefaultDataBufferFactory.sharedInstance, 4096);
headersSpec = bodySpec.body(BodyInserters.fromDataBuffers(flux));
// Yet another attempt; Take a guess...
InputStream bodyStream = new BufferedInputStream(Files.newInputStream(Path.of(file.get().getFile().toURI())));
headersSpec = bodySpec.body(BodyInserters.fromResource(new InputStreamResource(bodyStream)));
I'm using recent versions of the relevant libraries:
org.springframework.boot:spring-boot-starter-webflux -> 2.7.5
org.springframework.boot:spring-boot-starter-reactor-netty:2.7.5
org.springframework:spring-core:5.3.23
Epilogue: I'm wondering if the problem is not necessarily with Spring/WebClient/Netty -- as many of these code samples were inspired by other examples I've found -- but rather with some nuance of the Minio server?
I have a Spring Boot API and one of the endpoints allows users to upload videos. My controller takes the file as a MultipartFile and stores it in a temp folder accessible to Tomcat. Once I have it stored on disk, I push the video to an S3 bucket.
To me this seems less than optimal: if I wanted to have 100 or 1,000 users uploading at once, it seems really non-performant to write the files to disk first.
As a little background, I'm storing it on disk with the intention that if there is an issue pushing to S3 I can retry.
The below code might show what I'm doing better than the above:
public Video addVideo(@RequestParam("title") String title,
        @RequestParam("Description") String description,
        @RequestParam(value = "file", required = true) MultipartFile file) {
    this.amazonS3ClientService.uploadFileToS3Bucket(file, title, description);
}
Method for storing Video file:
String fileNameWithExtension = awsS3FileName + "." + FilenameUtils.getExtension(multipartFile.getOriginalFilename());
// creating the file in the server (temporarily)
File file = new File(tomcatTempDir + fileNameWithExtension);
FileOutputStream fos = new FileOutputStream(file);
fos.write(multipartFile.getBytes());
fos.close();
PutObjectRequest putObjectRequest = new PutObjectRequest(this.awsS3Bucket, awsS3BucketFolder + uniqueId + "/" + fileNameWithExtension, file);
if (enablePublicReadAccess) {
    putObjectRequest.withCannedAcl(CannedAccessControlList.PublicRead);
}
// Upload a file as a new object with ContentType and title specified
amazonS3.putObject(putObjectRequest);
// removing the file created in the server
file.delete();
So my question is: is there a better way in Tomcat to:
A) Take in a file via a controller
B) Push it to S3
There is no other way to do it with multipart. The problem with multipart is that, to properly segment parts from the request, they sometimes need to be skipped or be repeatable. That is impossible in memory without having memory explode. Therefore, Commons FileUpload caches them on disk after a certain threshold is reached.
Multipart requests are the worst way to do this. I highly recommend using either PUT or POST with content type application/octet-stream. You can take the bare request input stream and pass it to HttpClient to stream to your backend server. I did this 5 years ago and it works for gigabytes. I posted the solution on the Apache HttpClient mailing list.
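As a rough illustration of that approach (a sketch only; the endpoint path, backend URL, and the Apache HttpClient 4.x calls are my assumptions, not the original mailing-list solution):
// uses org.apache.http.client.methods.*, org.apache.http.entity.*, org.apache.http.impl.client.* (HttpClient 4.x)
@PostMapping(value = "/upload", consumes = MediaType.APPLICATION_OCTET_STREAM_VALUE) // hypothetical endpoint
public ResponseEntity<Void> upload(HttpServletRequest request) throws IOException {
    try (CloseableHttpClient client = HttpClients.createDefault()) {
        HttpPut put = new HttpPut("http://backend.example.com/storage/object"); // hypothetical backend URL
        // InputStreamEntity streams the body through without buffering it all in memory;
        // pass the client's Content-Length along if present (-1 falls back to chunked encoding)
        put.setEntity(new InputStreamEntity(request.getInputStream(),
                request.getContentLengthLong(), ContentType.APPLICATION_OCTET_STREAM));
        try (CloseableHttpResponse response = client.execute(put)) {
            return ResponseEntity.status(response.getStatusLine().getStatusCode()).build();
        }
    }
}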
There is one way multipart could still work, under specific conditions:
All parts are in the correct physical order you want to read them in
Your write to the backend is fast enough to sustain the read from the front
Consume the root part, then move on to the next physical one, processing the request body lazily. JAX-WS RI (Metro) handles multipart requests for XOP/MTOM very nicely; learn from it, because you won't be able to make it any better.
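For the Commons FileUpload case specifically, its streaming API can avoid the disk cache entirely by handing you each part's InputStream in physical order. A minimal sketch (FileUpload 1.x, inside a controller or servlet handling an HttpServletRequest):
// uses org.apache.commons.fileupload.servlet.ServletFileUpload and org.apache.commons.fileupload.*
ServletFileUpload upload = new ServletFileUpload(); // no FileItemFactory: pure streaming, nothing cached
FileItemIterator iter = upload.getItemIterator(request);
while (iter.hasNext()) {
    FileItemStream item = iter.next();
    try (InputStream stream = item.openStream()) {
        if (!item.isFormField()) {
            // stream this part straight to the backend; each part is readable once, in physical order
        }
    }
}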
Perhaps you can try streaming the input stream from your MultipartFile directly to S3.
Consider the following uploadFileToS3Bucket method:
public PutObjectResult uploadFileToS3Bucket(InputStream input, long size, String title, String description) {
    // Indicate the length of the content to avoid having the AWS SDK compute it
    // See: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/PutObjectRequest.html#PutObjectRequest-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
    ObjectMetadata objectMetadata = new ObjectMetadata();
    objectMetadata.setContentLength(size); // rely on the size reported by Spring; input.available() is not a reliable alternative
    // compute the object name as appropriate
    String key = "...";
    PutObjectRequest putObjectRequest = new PutObjectRequest(
            this.awsS3Bucket, key, input, objectMetadata
    );
    // The rest of your code
    if (enablePublicReadAccess) {
        putObjectRequest.withCannedAcl(CannedAccessControlList.PublicRead);
    }
    // Upload a file as a new object with ContentType and title specified
    return amazonS3.putObject(putObjectRequest);
}
Of course, you need to provide the service with the input stream obtained from the client request associated with the MultipartFile object:
public Video addVideo(
        @RequestParam("title") String title,
        @RequestParam("Description") String description,
        @RequestParam(value = "file", required = true) MultipartFile file) throws IOException {
    try (InputStream input = file.getInputStream()) {
        this.amazonS3ClientService.uploadFileToS3Bucket(input, file.getSize(), title, description);
    }
}
You can probably also play with the getBytes method of MultipartFile and create a ByteArrayInputStream to perform the operation.
In addVideo:
byte[] bytes = file.getBytes();
In uploadFileToS3Bucket:
ObjectMetadata objectMetadata = new ObjectMetadata();
objectMetadata.setContentLength(bytes.length);
PutObjectRequest putObjectRequest = new PutObjectRequest(
        this.awsS3Bucket, key, new ByteArrayInputStream(bytes), objectMetadata
);
I would prefer the first solution, but try to determine which option offers you the best performance.
I'm running on the IBM public cloud. I have API Connect in front of the Cloud Foundry microservice. I've gone through many posts and tried various things, but I can't seem to get this to work. Here are my property-file config settings for Spring Boot:
# The name of the application
spring.application.name=xxxxx
# web base path
management.endpoints.web.base-path=/
# Embedded tomcat config
server.tomcat.max-swallow-size=256MB
server.tomcat.max-http-post-size=256MB
# File size values
spring.servlet.multipart.max-file-size=256MB
spring.servlet.multipart.max-request-size=256MB
spring.servlet.multipart.enabled=true
# Server specific values
input.server=xxx
input.rtm.bucket=xxx
storage.server.base=xxx
# Cloudant database info
input.events.db.name=xxxx
input.ait.info.db.name=xxxx
letter.number.db.name=xxxx
letter.gen.data.db.name=xxxx
# Query index design documents
query.pad.ait.info.index.name=xxxx
query.pad.ait.info.design.doc=_xxxx
query.rfa.ltr.index.name=xxxx
query.rfa.ltr.design.doc=xxxx
# The logging levels of the application
logging.level.application=DEBUG
#logging.level.root=DEBUG
#logging.level.org.springframework.web=INFO
# Testing
unit.testing=false
integration.testing=true
# Jackson json config
spring.jackson.mapper.accept-case-insensitive-properties=true
Here is the REST API function for POSTing the file:
@PostMapping(value = "/send/rtm/document/{errata}")
public @ResponseBody ResponseEntity<Object> receiveRtmDocument(@PathVariable("errata") String errata, @RequestParam("file") MultipartFile file)
I'm using Spring Boot 2.1.6 and have not updated anything in the POM file. I'm attempting to send a 5.8 MB file to the API and it gives me this error:
com.ibm.tools.cloud.exceptions.DataNotJsonException: <html>
<head><title>413 Request Entity Too Large</title></head>
<body bgcolor="white">
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>openresty</center>
</body>
</html>
at com.ibm.msc.gasm.sapt.input.AitInputManagement.sendRtmDocument(AitInputManagement.java:182)
at com.ibm.msc.gasm.sapt.test.InputServiceTester.performTest(InputServiceTester.java:142)
at com.ibm.msc.gasm.sapt.test.InputServiceTester.main(InputServiceTester.java:96)
Here is the send code I am using in Java for the multipart. The only other headers I use that are not listed here are my authorization headers.
// Create the URL connection
HttpURLConnection conn = (HttpURLConnection) (new URL(requestUri)).openConnection();
if (content != null || multipartFile) conn.setDoOutput(true);
conn.setRequestMethod(method.toString());
// Set the headers
Enumeration<String> keys = headers.keys();
while (keys.hasMoreElements())
{
    // Pull out the key
    String key = keys.nextElement();
    // Set the header
    conn.setRequestProperty(key, headers.get(key));
}
// Set the accept header
if (acceptHeader != null) conn.setRequestProperty("Accept", acceptHeader);
// Set the content header
if (contentTypeHeader != null) conn.setRequestProperty("Content-Type", contentTypeHeader);
if (content != null)
{
    // Set the content
    DataOutputStream dos = new DataOutputStream(conn.getOutputStream());
    if (content.isFileContent()) dos.write(content.getFileContentAsByteArray());
    else if (content.isByteArrayContent()) dos.write(content.getContentAsByteArray());
    else if (content.isStringContent()) dos.write(content.getStringContentAsByteArray());
    // close the stream
    dos.flush();
    dos.close();
}
// Set the multipart file
if (multipartFile)
{
    // Set the properties
    conn.setUseCaches(false);
    conn.setRequestProperty("Connection", "Keep-Alive");
    conn.setRequestProperty("Cache-Control", "no-cache");
    conn.setRequestProperty("Content-Type", "multipart/form-data;boundary=" + MP_BOUNDRY);
    // Set the content
    DataOutputStream dos = new DataOutputStream(conn.getOutputStream());
    dos.writeBytes(MP_HYPHENS + MP_BOUNDRY + StringUtils.crlf);
    dos.writeBytes("Content-Disposition: form-data; name=\"" + this.mpName + "\"; filename=\"" + this.mpFileName + "\"" + StringUtils.crlf);
    dos.writeBytes(StringUtils.crlf);
    dos.write(IOUtils.toByteArray(new FileInputStream(this.mpFileNamePath)));
    dos.writeBytes(StringUtils.crlf);
    dos.writeBytes(MP_HYPHENS + MP_BOUNDRY + MP_HYPHENS + StringUtils.crlf);
    // close the stream
    dos.flush();
    dos.close();
}
// Get the response
HttpResponseMessage response = null;
try
{
    // Extract the stream
    InputStream is = (conn.getResponseCode() >= HttpURLConnection.HTTP_BAD_REQUEST) ? conn.getErrorStream() : conn.getInputStream();
    // Pull out the information
    byte[] data = IOUtils.toByteArray(is);
    // Set the response
    response = new HttpResponseMessage(requestUri, HttpStatusCode.getType(conn.getResponseCode()), acceptHeader, data, conn.getResponseMessage());
}
catch (Throwable e)
{
    throw new IOException(String.format("Error reading results from %s", requestUri), e);
}
// Close the request
conn.disconnect();
// Return the response
return response;
I've tried several things, but I am not sure what I am missing. Anyone have any ideas how to fix this?
You need to change your NGINX settings.
Add the following line to the config file:
client_max_body_size 20M;
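For context, the directive is valid at http, server, or location level in nginx.conf; a minimal sketch (the 20M value is just this answer's example):
http {
    server {
        # requests with bodies larger than this get the 413 response
        client_max_body_size 20M;
    }
}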
If you use a form to submit the file and accept it with MultipartFile, the default file size limit is 2 MB. If you want to upload a large file, you need to configure the allowed sizes.
https://www.cyberciti.biz/faq/linux-unix-bsd-nginx-413-request-entity-too-large/
Try these two in your application.properties
server.tomcat.max-swallow-size=XMB //maximum size of the request body/payload
server.tomcat.max-http-post-size=XMB //maximum size of entire POST request
where X is your desired size in megabytes.
I'm trying to create a controller that downloads a large file using RxNetty.
I wrote something naive like:
@RequestMapping(method = RequestMethod.GET, path = "largeFile")
public DeferredResult<ResponseEntity<byte[]>> largeFile() throws IOException {
    Observable<ResponseEntity<byte[]>> observable = RxNetty.createHttpGet(URL)
            .flatMap(AbstractHttpContentHolder::getContent)
            .map(data -> {
                byte[] bytes = new byte[data.readableBytes()];
                data.readBytes(bytes);
                return new ResponseEntity<>(bytes, HttpStatus.OK);
            });
    DeferredResult<ResponseEntity<byte[]>> deferredResult = new DeferredResult<>();
    observable.subscribe(deferredResult::setResult, deferredResult::setErrorResult);
    return deferredResult;
}
Nevertheless I get the following error:
Caused by: io.netty.handler.codec.TooLongFrameException: HTTP content length exceeded 1048576 bytes.
The default client in RxNetty 0.4.x aggregates the HTTP payload, which imposes a limit on the maximum content length. The exception you see is because of that limit. You can alter the default client using a PipelineConfigurator, as shown in this example:
https://github.com/ReactiveX/RxNetty/blob/0.4.x/rxnetty-examples/src/main/java/io/reactivex/netty/examples/http/chunk/HttpChunkClient.java#L49
after which the payload will be chunked into multiple buffers.
Alternatively, if you know the max size, then you can use an appropriate payload aggregator in the configurator.
How can I get a PDF located in a file in a server's directory structure to load in a browser for users of a Spring MVC application?
I have googled this and found postings about how to generate PDFs, but their answers do not work in this situation. For example, one other posting is not relevant because res.setContentType("application/pdf"); in my code below does not solve the problem. Another posting describes how to do it from a database but does not show full working controller code. Other postings had similar problems that made them inapplicable here.
I need to simply serve up a file (not from a database) and have it be viewable by a user in their browser. The best I have come up with is the code below, which asks the user to download the PDF or to view it in a separate application outside the browser. What specific changes can I make to the code below so that the user automatically sees the PDF content inside their browser when they click the link, instead of being prompted to download it?
@RequestMapping(value = "/test-pdf")
public void generatePdf(HttpServletRequest req, HttpServletResponse res) {
    res.setContentType("application/pdf");
    res.setHeader("Content-Disposition", "attachment;filename=report.pdf");
    ServletOutputStream outStream = null;
    try {
        BufferedInputStream bis = new BufferedInputStream(
                new FileInputStream(new File("/path/to", "nameOfThe.pdf")));
        /*ServletOutputStream*/ outStream = res.getOutputStream();
        // to make it easier to change to 8 or 16 KBs
        int FILE_CHUNK_SIZE = 1024 * 4;
        byte[] chunk = new byte[FILE_CHUNK_SIZE];
        int bytesRead = 0;
        while ((bytesRead = bis.read(chunk)) != -1) {
            outStream.write(chunk, 0, bytesRead);
        }
        bis.close();
        outStream.flush();
        outStream.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Change
res.setHeader("Content-Disposition", "attachment;filename=report.pdf");
To
res.setHeader("Content-Disposition", "inline;filename=report.pdf");
You should also set the Content-Length:
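For instance, a one-line sketch reusing the file from the question's code (setContentLengthLong requires Servlet 3.1+; on older containers use setContentLength with an int cast):
res.setContentLengthLong(new File("/path/to", "nameOfThe.pdf").length());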
FileCopyUtils is handy:
@Controller
public class FileController {

    @RequestMapping("/report")
    void getFile(HttpServletResponse response) throws IOException {
        String fileName = "report.pdf";
        String path = "/path/to/" + fileName;
        File file = new File(path);
        FileInputStream inputStream = new FileInputStream(file);
        response.setContentType("application/pdf");
        response.setContentLength((int) file.length());
        response.setHeader("Content-Disposition", "inline;filename=\"" + fileName + "\"");
        FileCopyUtils.copy(inputStream, response.getOutputStream());
    }
}
}