Spring WebClient hangs when posting File / InputStream / Resource to Minio

I am developing a proxy service to a Minio server using WebClient that handles all Minio/S3 API endpoints. Most of them work fine, but I have encountered one case in which the PUT operation seems to get hung up when trying to set the body of the request to either an InputStream, a File, or a Resource pointing to it. (See epilogue at the bottom, as I'm left wondering where the problem really is.)
The only way I've found to make it work is to read the file contents to an in-memory byte array. The following baseline works, for example:
WebClient.UriSpec<WebClient.RequestBodySpec> uriSpec = client.method(request.getMethod());
WebClient.RequestBodySpec bodySpec = uriSpec.uri(uri);
WebClient.RequestHeadersSpec<?> headersSpec = bodySpec;
try {
// read file to byte array; works fine
byte[] bytes = Files.readAllBytes(Path.of(file.get().getFile().toURI()));
// set it to the request body
headersSpec = bodySpec.bodyValue(bytes);
} catch (IOException e) {
throw new UncheckedIOException(e);
}
// manipulate some headers
headersSpec = headersSpec.headers(httpHeaders -> ...);
// execute the request; works fine in this scenario
return headersSpec.exchangeToMono(resp -> ...)
.doOnError(throwable -> log.error("Trouble proxying request: " + throwable.getMessage(), throwable));
However, every alternative I try in order to stream this content instead results in a request that seems to hang in the headersSpec.exchangeToMono invocation. I don't see any errors on the proxy service, and the client socket eventually gives up:
java.net.SocketTimeoutException: timeout
client-tester_1 | at okio.SocketAsyncTimeout.newTimeoutException(JvmOkio.kt:143) ~[okio-jvm-2.8.0.jar:na]
client-tester_1 | Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Some examples of failure (or, paraphrasing Edison, I've successfully found at least a dozen ways that do not work):
// Use same byte array as above; Hangs
Resource resource = new ByteArrayResource(bytes);
headersSpec = bodySpec.bodyValue(resource);
// Read an input stream from the file (this one relies on an HttpMessageWriter<InputStream> that I configured on the client); Hangs
InputStream bodyStream = new BufferedInputStream(Files.newInputStream(Path.of(file.get().getFile().toURI())));
headersSpec = bodySpec.bodyValue(bodyStream);
// Resource for the file; Hangs
Resource resource = new FileSystemResource(Path.of(file.get().getFile().toURI()));
Flux<DataBuffer> flux = DataBufferUtils.read(resource, DefaultDataBufferFactory.sharedInstance, 4096);
headersSpec = bodySpec.body(flux, DataBuffer.class);
// Different resource; Hangs
Resource resource = new UrlResource(file.get().getFile().toURI());
headersSpec = bodySpec.bodyValue(resource);
// Try BodyInserters; Hangs
Flux<DataBuffer> flux = DataBufferUtils.read(Path.of(file.get().getFile().toURI()), DefaultDataBufferFactory.sharedInstance, 4096);
headersSpec = bodySpec.body(BodyInserters.fromDataBuffers(flux));
// Yet another attempt; take a guess...
Resource resource = new InputStreamResource(new BufferedInputStream(Files.newInputStream(Path.of(file.get().getFile().toURI()))));
headersSpec = bodySpec.body(BodyInserters.fromResource(resource));
I'm using recent versions of the relevant libraries:
org.springframework.boot:spring-boot-starter-webflux -> 2.7.5
org.springframework.boot:spring-boot-starter-reactor-netty:2.7.5
org.springframework:spring-core:5.3.23
Epilogue: I'm wondering if the problem is not necessarily with Spring/WebClient/Netty -- as many of these code samples were inspired by other examples I've found -- but rather with some nuance of the Minio server itself?

Related

Spring MVC OutputStream buffer size / flush not blocking?

I have been trying to investigate the buffering behaviour of the HttpServletResponse OutputStream (using Tomcat in Spring Boot). My expectation is that flushing on the server side blocks until all data has been transmitted to the client.
I am trying to understand this because I am noticing that clients seem to spend much longer reading the response than the server spends producing it.
My setup looks something like this:
Server:
@RequestMapping(path = "/some/url", method = RequestMethod.POST)
public void call(HttpServletRequest request, HttpServletResponse response) throws IOException
{
OutputStream outputStream = response.getOutputStream();
... writing data here ...
outputStream.flush();
response.flushBuffer();
}
Some dummy client:
HttpURLConnection con = (HttpURLConnection) urlobj.openConnection();
con.setRequestMethod("POST");
con.setDoInput(true);
con.setDoOutput(true);
int nresp = 0;
final InputStream in = con.getInputStream();
while (true)
{
if (in.read() < 0)
break;
nresp++;
if ((nresp & 31) == 0)
Thread.sleep(1L); // slow down read
}
In this example, depending on the response size, the server finishes in 40 ms, but the client is reading data for 35 seconds. Also, in Wireshark I can see data being transmitted the entire time, which means that the server is buffering the data.
I also tried with StreamingResponseBody as the return value, or with response.setBufferSize(1024), but the result is the same.
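For reference, the StreamingResponseBody variant (org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBody) looked roughly like this sketch, with the actual writing elided:
@PostMapping("/some/url")
public StreamingResponseBody call() {
return outputStream -> {
// ... writing data here ...
outputStream.flush(); // still only hands the data to the container's buffers
};
}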
Does anybody understand where this buffering is occurring and if it's behaving correctly?
Is there any way on the server to completely flush all buffers and block until the data has really been flushed to the client?
cheers, Christian

Tomcat Performance with Spring Boot API for File Upload

I have a Spring Boot API, and one of the endpoints allows users to upload videos. My controller basically takes the file as a MultipartFile; I then store it in a temp folder accessible to Tomcat. Once I have it stored on disk, I push the video to an S3 bucket.
To me, anyway, this seems less than optimal: if I wanted to have 100 or 1,000 users uploading at once, it seems really non-performant to write the files to disk first.
As a little background, I'm storing it on disk with the intention that if there is an issue pushing to S3, I can retry.
The below code might show what I'm doing better than the above:
public Video addVideo(@RequestParam("title") String title,
@RequestParam("description") String description,
@RequestParam(value = "file", required = true) MultipartFile file) {
this.amazonS3ClientService.uploadFileToS3Bucket(file, title, description);
}
Method for storing Video file:
String fileNameWithExtension = awsS3FileName + "." + FilenameUtils.getExtension(multipartFile.getOriginalFilename());
// creating the file in the server (temporarily)
File file = new File(tomcatTempDir + fileNameWithExtension);
FileOutputStream fos = new FileOutputStream(file);
fos.write(multipartFile.getBytes());
fos.close();
PutObjectRequest putObjectRequest = new PutObjectRequest(this.awsS3Bucket, awsS3BucketFolder + uniqueId + "/" + fileNameWithExtension, file);
if (enablePublicReadAccess) {
putObjectRequest.withCannedAcl(CannedAccessControlList.PublicRead);
}
// Upload a file as a new object with ContentType and title specified
amazonS3.putObject(putObjectRequest);
// removing the file created in the server
file.delete();
So my question is: is there a better way in Tomcat to:
A) Take in a file via a controller
B) Push it to S3
There is no other way to do it with multipart. The problem with multipart is that, to properly segment parts from the request, they sometimes need to be skipped or be repeatable. That is impossible in memory without letting memory consumption explode. Therefore, Commons FileUpload caches them on disk after a certain threshold is reached.
Multipart requests are the worst way to do that. I highly recommend using either PUT or POST with content type application/octet-stream. You can take the bare request input stream and pass it to HttpClient to stream to your backend server. I did this five years ago already, and it works for gigabytes. I have posted the solution on the Apache HttpClient mailing list.
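A minimal sketch of that idea, assuming a Spring MVC controller and the AWS SDK v1 (the bucket name, path, and the injected amazonS3 client are placeholders):
@PutMapping(path = "/videos/{name}", consumes = "application/octet-stream")
public ResponseEntity<Void> upload(@PathVariable String name, HttpServletRequest request) throws IOException {
    ObjectMetadata metadata = new ObjectMetadata();
    // the length is known up front, so the SDK can stream instead of buffering
    metadata.setContentLength(request.getContentLengthLong());
    amazonS3.putObject(new PutObjectRequest("my-bucket", "videos/" + name, request.getInputStream(), metadata));
    return ResponseEntity.status(HttpStatus.CREATED).build();
}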
There is one possibility for how this could work, under specific conditions:
All parts are in the correct physical order you want to read
Your write to a backend is fast enough to sustain the read from the front
Consume the root part and then move on to the next physical one, processing the request body lazily. JAX-WS RI (Metro) has very nice handling of multipart requests for XOP/MTOM. Learn from it, because you won't be able to make it any better.
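Under those conditions, a lazy pass with the Commons FileUpload streaming API might look roughly like this sketch (request is the incoming HttpServletRequest; the backend write is elided):
ServletFileUpload upload = new ServletFileUpload(); // no factory: streaming mode, nothing is cached on disk
FileItemIterator parts = upload.getItemIterator(request);
while (parts.hasNext()) { // parts are consumed strictly in their physical order
    FileItemStream part = parts.next();
    try (InputStream in = part.openStream()) {
        if (part.isFormField()) {
            String value = Streams.asString(in);
        } else {
            // stream 'in' to the backend here, fast enough to sustain the read
        }
    }
}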
Perhaps you can try to stream the input stream from your MultipartFile directly to S3.
Consider the following uploadFileToS3Bucket method:
public PutObjectResult uploadFileToS3Bucket(InputStream input, long size, String title, String description) {
// Indicate the length of the information to avoid the need to compute it by the AWS SDK
// See: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/PutObjectRequest.html#PutObjectRequest-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
ObjectMetadata objectMetadata = new ObjectMetadata();
objectMetadata.setContentLength(size); // rely on the Spring implementation; maybe you can also use input.available()
// compute the object name as appropriate
String key = "...";
PutObjectRequest putObjectRequest = new PutObjectRequest(
this.awsS3Bucket, key, input, objectMetadata
);
// The rest of your code
if (enablePublicReadAccess) {
putObjectRequest.withCannedAcl(CannedAccessControlList.PublicRead);
}
// Upload a file as a new object with ContentType and title specified
return amazonS3.putObject(putObjectRequest);
}
Of course, you need to provide the service with the input stream obtained from the client request associated with the MultipartFile object:
public Video addVideo(
@RequestParam("title") String title,
@RequestParam("description") String description,
@RequestParam(value = "file", required = true) MultipartFile file) throws IOException {
try (InputStream input = file.getInputStream()) {
this.amazonS3ClientService.uploadFileToS3Bucket(input, file.getSize(), title, description);
}
}
Probably you can also play with the getBytes method of MultipartFile and create a ByteArrayInputStream to perform the operation.
In addVideo:
byte[] bytes = file.getBytes();
In uploadFileToS3Bucket:
ObjectMetadata objectMetadata = new ObjectMetadata();
objectMetadata.setContentLength(bytes.length);
PutObjectRequest putObjectRequest = new PutObjectRequest(
this.awsS3Bucket, key, new ByteArrayInputStream(bytes), objectMetadata
);
I would prefer the first solution, but try to determine which option offers you the best performance.

How to mock a multipart file upload when using Spring and Apache File Upload

The project I'm working on needs to support large file uploads and know the time taken during their upload.
To handle the large files I'm using the streaming API of Apache FileUpload, which also allows me to measure the time taken for the complete stream to be saved.
The problem I'm having is that I cannot seem to utilise MockMvc in an integration test on this controller. I know that the controller works, as I've successfully uploaded files using Postman.
Simplified Controller Code:
@PostMapping("/upload")
public String handleUpload(HttpServletRequest request) throws Exception {
ServletFileUpload upload = new ServletFileUpload();
FileItemIterator iterStream = upload.getItemIterator(request);
while (iterStream.hasNext()) {
FileItemStream item = iterStream.next();
String name = item.getFieldName();
InputStream stream = item.openStream();
if (!item.isFormField()) {
// Process the InputStream
} else {
String formFieldValue = Streams.asString(stream);
}
}
return "uploaded"; // simplified
}
Simplified Test Code:
private fun uploadFile(tfr: TestFileContainer) {
val mockFile = MockMultipartFile("file", tfr.getData()) // getData() returns a ByteArray
val receiveFileRequest = MockMvcRequestBuilders.multipart("/upload")
.file(mockFile)
.contentType(MediaType.MULTIPART_FORM_DATA)
val result = mockMvc.perform(receiveFileRequest)
.andExpect(status().isCreated)
.andExpect(header().exists(LOCATION))
.andReturn()
}
This is the error I'm currently getting
org.apache.tomcat.util.http.fileupload.FileUploadException: the request was rejected because no multipart boundary was found
Can anyone help?
The MockMultipartFile approach won't work because Spring does the work behind the scenes and simply passes the file around, so no real multipart body (and hence no boundary) is ever constructed.
Ended up using RestTemplate instead as it actually constructs requests.
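For reference, a rough sketch of that approach with Spring Boot's TestRestTemplate, in Java (port, data, and the filename are placeholders):
LinkedMultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
body.add("file", new ByteArrayResource(data) {
    @Override
    public String getFilename() {
        return "test.bin"; // a filename makes this a file part
    }
});
HttpHeaders headers = new HttpHeaders();
headers.setContentType(MediaType.MULTIPART_FORM_DATA); // a real boundary is generated here
ResponseEntity<String> response = new TestRestTemplate().postForEntity(
    "http://localhost:" + port + "/upload", new HttpEntity<>(body, headers), String.class);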

Sending gzipped data over HTTPS

I need to send a gzipped byte array over HTTPS. I searched the web, and the only thing I could find is SharpGIS.GZipWebClient.
However, the problem is that this third-party solution only works with WebClient, which only allows you to send String data.
(I'm on Windows Phone 8. Most of the WebClient methods do not exist.)
Any ideas to solve this problem?
Edit:
This is how I tried to POST JSON data over HTTPS using SharpGIS:
WebClient webClient = new SharpGIS.GZipWebClient();
webClient.Headers["Accept-Encoding"] = "gzip";
var uri = new Uri(pUrl, UriKind.Absolute);
webClient.UploadStringCompleted += new UploadStringCompletedEventHandler(wc_UploadStringCompleted);
webClient.UploadStringTaskAsync(uri, jsonAsString);
But it doesn't compress the string either (the same as when using the OpenWriteSync method).
You write the post data in the OpenWriteCompleted handler, like this:
void webClient_OpenWriteCompleted(object sender, OpenWriteCompletedEventArgs e)
{
Stream s = e.Result;
s.Write(jsonAsByteArray, 0, jsonAsByteArray.Length);
s.Flush();
s.Close();
}
You should also add the appropriate error handling.

JUnit needs special permissions?

My builds have been failing due to some of the integration tests I've been running. I'm stuck on why it won't work. Here is an example of the output:
I'm using Maven to first build, then it calls the JUnit tests. I'm seeing this 401 Unauthorized message in every single test, and I believe that's what is causing the builds to fail. In my mind, this means there are some permissions / authentication parameters that need to be set. Where would I go about doing this in JUnit?
Edit
@Test
public void testXmlHorsesNonRunners() throws Exception {
String servletUrl = SERVER + "sd/date/2013-01-13/horses/nonrunners";
Document results = issueRequest(servletUrl, APPLICATION_XML, false);
assertNotNull(results);
// debugDocument(results, "NonRunners");
String count = getXPathStringValue(
"string(count(hrdg:data/hrdg:meeting/hrdg:event/hrdg:nonrunner/hrdg:selection))",
results);
assertEquals("non runners", "45", count);
}
If you can, try to ignore the detail; effectively, this is making a request. This is a sample of a test that uses the issueRequest method, which is what makes the HTTP requests. (It's a big method, which is why I didn't post it originally. I'll try to make it as readable as possible.)
logger.info("Sending request: " + servletUrl);
HttpGet httpGet = null;
// InputStream is = null;
DefaultHttpClient httpclient = null;
try {
httpclient = new DefaultHttpClient();
doFormLogin(httpclient, servletUrl, acceptMime, isIrishUser);
httpGet = new HttpGet(servletUrl);
httpGet.addHeader("accept", acceptMime);
// but more importantly now add the user agent header
setUserAgent(httpGet, acceptMime);
logger.info("executing request" + httpGet.getRequestLine());
// Execute the request
HttpResponse response = httpclient.execute(httpGet);
// Examine the response status
StatusLine statusLine = response.getStatusLine();
logger.info(statusLine);
switch (statusLine.getStatusCode()) {
case 401:
throw new HttpResponseException(statusLine.getStatusCode(),
"Unauthorized");
case 403:
throw new HttpResponseException(statusLine.getStatusCode(),
"Forbidden");
case 404:
throw new HttpResponseException(statusLine.getStatusCode(),
"Not Found");
default:
if (300 < statusLine.getStatusCode()) {
throw new HttpResponseException(statusLine.getStatusCode(),
"Unexpected Error");
}
}
// Get hold of the response entity
HttpEntity entity = response.getEntity();
Document doc = null;
if (entity != null) {
InputStream instream = entity.getContent();
try {
// debugContent(instream);
doc = documentBuilder.parse(instream);
} catch (IOException ex) {
// In case of an IOException the connection will be released
// back to the connection manager automatically
throw ex;
} catch (RuntimeException ex) {
// In case of an unexpected exception you may want to abort
// the HTTP request in order to shut down the underlying
// connection and release it back to the connection manager.
httpGet.abort();
throw ex;
} finally {
// Closing the input stream will trigger connection release
instream.close();
}
}
return doc;
} finally {
// Release the connection.
closeConnection(httpclient);
}
I notice that your test output shows HTTP/1.1 500 Internal Server Error a couple of lines before the 401 error. I wonder if the root cause could be hiding in there. If I were you I'd try looking for more details about what error happened on the server at that point in the test, to see if it could be responsible for the authentication problem (maybe the failure is in a login controller of some sort, or is causing a session to be cancelled?)
Alternately: it looks like you're using the Apache HttpClient library to do the request, inside the issueRequest method. If you need to include authentication credentials in the request, that would be the code you'd need to change. Here's an example of doing HTTP Basic authentication in HttpClient, if that helps. (And more examples, if that one doesn't.)
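For instance, with the DefaultHttpClient that issueRequest already uses, registering Basic credentials would look roughly like this (host, port, and credentials are placeholders):
DefaultHttpClient httpclient = new DefaultHttpClient();
// the credentials are offered when the server answers with a 401 challenge
httpclient.getCredentialsProvider().setCredentials(
    new AuthScope("localhost", 8080),
    new UsernamePasswordCredentials("user", "pass"));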
(I'd second the observation that this problem probably isn't specific to JUnit. If you need to do more research, I'd suggest learning more about HttpClient, and about what this app expects the browser to send. One possibility: use something like Chrome Dev Tools to peek at your communications with the server when you do this manually, and see if there's anything important that the test isn't doing, or is doing differently.
Once you've figured out how to login, it might make sense to do it in a #Before method in your JUnit test.)
HTTP permission denied has nothing to do with JUnit. You probably need to set your credentials while making the request in the code itself. Show us some code.
Also, unit testing is not really meant to access the internet. Its purpose is to test small, concise parts of your code that shouldn't rely on any external factors. Integration tests should cover that.
If you can, try to mock your network requests using EasyMock or PowerMock and make them return a resource you would load from your local resources folder (e.g. test/resources).
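For example, if the HTTP call is hidden behind an injectable seam (here a hypothetical DocumentFetcher interface wrapping issueRequest, with a hypothetical loadFixture helper), EasyMock could serve a local fixture instead of touching the network:
DocumentFetcher fetcher = EasyMock.createMock(DocumentFetcher.class); // DocumentFetcher is hypothetical
EasyMock.expect(fetcher.fetch("sd/date/2013-01-13/horses/nonrunners"))
    .andReturn(loadFixture("/fixtures/nonrunners.xml")); // hypothetical helper reading from test/resources
EasyMock.replay(fetcher);
// inject 'fetcher' into the code under test, then run the same assertions as before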
