Transferring big files in Spring Integration

The Spring Integration flow I wrote has to get files (some of them as big as 4 GB) from a REST service and transfer them to a remote shared drive. For downloading them from the REST service I configured this simple component:
@Bean
public HttpRequestExecutingMessageHandler httpDownloader(RestTemplate template) {
    Expression expr = new SpelExpressionParser().parseExpression("payload.url");
    HttpRequestExecutingMessageHandler handler = new HttpRequestExecutingMessageHandler(expr, template);
    handler.setExpectedResponseType(byte[].class);
    handler.setHttpMethod(HttpMethod.GET);
    return handler;
}
Unfortunately this won't scale: for larger files it eventually throws java.lang.OutOfMemoryError: Java heap space, even if I add more memory with -Xmx or -XX:MaxPermSize.
So my question is: what can I do to avoid these problems, no matter how big the files get?

I think I answered this in another, similar question of yours: Spring's RestTemplate is not designed for streaming the response body. This is explained in this SO thread: Getting InputStream with RestTemplate.
One solution which may work for you is to write a custom HttpMessageConverter which returns a File object containing the data from the HTTP response. That article explains how to do it with a ResponseExtractor, and something like a FileHttpMessageConverter is not hard to implement based on the same idea. See StreamUtils.copy(InputStream in, OutputStream out).
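For illustration, a minimal sketch of that ResponseExtractor approach (the url variable and the temp-file naming are assumptions, not part of the linked article):
// Sketch: stream the response body straight into a temp file instead of a byte[].
// RestTemplate.execute(...) hands the raw ClientHttpResponse to the extractor,
// so the body is copied without ever being buffered whole in memory.
File file = restTemplate.execute(url, HttpMethod.GET, null, response -> {
    File tmp = File.createTempFile("download-", ".tmp");
    try (OutputStream out = new FileOutputStream(tmp)) {
        StreamUtils.copy(response.getBody(), out);
    }
    return tmp;
});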
Then you inject this FileHttpMessageConverter into your HttpRequestExecutingMessageHandler via setMessageConverters(List&lt;HttpMessageConverter&lt;?&gt;&gt; messageConverters).
Your service for the remote shared drive can then deal with this local temporary file and transfer the large content without consuming memory.
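A minimal sketch of that wiring, assuming a FileHttpMessageConverter implemented as described above:
// Sketch: the handler now expects a File and uses only the custom converter.
// FileHttpMessageConverter is the hypothetical converter described above.
@Bean
public HttpRequestExecutingMessageHandler httpDownloader(RestTemplate template) {
    Expression expr = new SpelExpressionParser().parseExpression("payload.url");
    HttpRequestExecutingMessageHandler handler = new HttpRequestExecutingMessageHandler(expr, template);
    handler.setExpectedResponseType(File.class);
    handler.setHttpMethod(HttpMethod.GET);
    handler.setMessageConverters(Collections.singletonList(new FileHttpMessageConverter()));
    return handler;
}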
See also this article about a possible approach via WebFlux: https://www.amitph.com/spring-webclient-large-file-download/

I created this, starting from the ByteArrayHttpMessageConverter class, and injected it into the custom RestTemplate I use. But this solution is based on passing a File message around, which is not quite the streaming I was hoping for.
public class FileCustomConverter extends AbstractHttpMessageConverter<File> {

    public FileCustomConverter() {
        super(MediaType.APPLICATION_OCTET_STREAM, MediaType.ALL);
    }

    @Override
    public boolean supports(Class<?> clazz) {
        return File.class == clazz;
    }

    @Override
    public File readInternal(Class<? extends File> clazz, HttpInputMessage inputMessage) throws IOException {
        File outputFile = File.createTempFile(UUID.randomUUID().toString(), ".tmp");
        try (OutputStream outputStream = new FileOutputStream(outputFile)) {
            StreamUtils.copy(inputMessage.getBody(), outputStream);
        }
        return outputFile;
    }

    @Override
    protected Long getContentLength(File file, @Nullable MediaType contentType) {
        return file.length();
    }

    @Override
    protected void writeInternal(File file, HttpOutputMessage outputMessage) throws IOException {
        try (InputStream inputStream = new FileInputStream(file)) {
            StreamUtils.copy(inputStream, outputMessage.getBody());
        }
    }
}
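For completeness, a minimal sketch of how such a converter might be registered with the RestTemplate (the bean wiring is an assumption):
// Sketch: put the converter first so it wins for File response types.
@Bean
public RestTemplate downloadRestTemplate() {
    RestTemplate template = new RestTemplate();
    template.getMessageConverters().add(0, new FileCustomConverter());
    return template;
}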

Related

How to configure Spring DataBuffer size on WebFilter

I'm getting gzipped content from the client and I need to decompress it before it reaches the controller; otherwise I get a Jackson parsing exception.
I created a WebFilter that wraps the request and maps the body into a decompressed byte array like this:
@Override
public Flux<DataBuffer> getBody() {
    return request.getBody().map(requestDataBuffer -> {
        try {
            GZIPInputStream gzipInputStream = new GZIPInputStream(requestDataBuffer.asInputStream());
            StringWriter writer = new StringWriter();
            IOUtils.copy(gzipInputStream, writer, UTF_8);
            byte[] targetArray = writer.toString().getBytes();
            return new DefaultDataBufferFactory().wrap(targetArray);
        }
        catch (IOException e) {
            LOG.error("failed to create gzip input stream. content-encoding is {}", request.getHeaders().getFirst(CONTENT_ENCODING));
            return requestDataBuffer;
        }
    });
}
However, when the request body is too large the data buffer doesn't contain all the data, therefore I get stream exceptions.
Any ideas how to configure the data buffer or how to accept gzipped content?
I think the best way is to rely on the Netty implementation for that, and configure the server to use that support from Netty.
You can create a component (or return a new instance of this directly from a @Bean method) that customizes the Reactor Netty server:
@Component
public class RequestInflateCustomizer implements NettyServerCustomizer {

    @Override
    public HttpServer apply(HttpServer httpServer) {
        return httpServer.tcpConfiguration(
                tcp -> tcp.doOnConnection(conn -> conn.addHandlerFirst(new HttpContentDecompressor())));
    }
}
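If you prefer the @Bean style mentioned above, a minimal sketch (the method name is an assumption):
// Sketch: the same customizer returned directly from a @Bean method,
// relying on NettyServerCustomizer being a functional interface.
@Bean
public NettyServerCustomizer inflateCustomizer() {
    return httpServer -> httpServer.tcpConfiguration(
            tcp -> tcp.doOnConnection(conn -> conn.addHandlerFirst(new HttpContentDecompressor())));
}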

Spring Boot MVC to allow any kind of content-type in controller

I have a RestController that multiple partners use to send XML requests. However, this is a legacy system that was passed on to me, and the original implementation was done in a very loose way in PHP.
This has allowed clients, who now refuse to change, to send different content types (application/xml, text/xml, application/x-www-form-urlencoded), and it has left me with the need to support many MediaTypes to avoid returning 415 Unsupported Media Type errors.
I have used the following code in a configuration class to allow many media types.
@Bean
public MarshallingHttpMessageConverter marshallingMessageConverter() {
    MarshallingHttpMessageConverter converter = new MarshallingHttpMessageConverter();
    converter.setMarshaller(jaxbMarshaller());
    converter.setUnmarshaller(jaxbMarshaller());
    converter.setSupportedMediaTypes(Arrays.asList(MediaType.APPLICATION_OCTET_STREAM, MediaType.APPLICATION_XML,
            MediaType.TEXT_XML, MediaType.TEXT_PLAIN, MediaType.APPLICATION_FORM_URLENCODED, MediaType.ALL));
    return converter;
}

@Bean
public Jaxb2Marshaller jaxbMarshaller() {
    Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
    marshaller.setClassesToBeBound(CouponIssuedStatusDTO.class, CouponIssuedFailedDTO.class,
            CouponIssuedSuccessDTO.class, RedemptionSuccessResultDTO.class, RedemptionResultHeaderDTO.class,
            RedemptionFailResultDTO.class, RedemptionResultBodyDTO.class, RedemptionDTO.class, Param.class,
            ChannelDTO.class, RedeemRequest.class);
    Map<String, Object> props = new HashMap<>();
    props.put(javax.xml.bind.Marshaller.JAXB_FORMATTED_OUTPUT, true);
    marshaller.setMarshallerProperties(props);
    return marshaller;
}
The controller method is this:
@PostMapping(value = "/request", produces = { "application/xml;charset=UTF-8" }, consumes = MediaType.ALL_VALUE)
public ResponseEntity<RedemptionResultDTO> request(
        @RequestHeader(name = "Content-Type", required = false) String contentType,
        @RequestBody String redeemRequest) {
    return requestCustom(contentType, redeemRequest);
}
This endpoint is hit by all clients. Only one last client is giving me trouble: they send Content-Type: application/x-www-form-urlencoded; charset=65001 (UTF-8), i.e. with "65001 (UTF-8)" as the charset value.
Because of the way the charset is sent, Spring Boot refuses to return anything but 415; not even MediaType.ALL seems to have any effect.
Is there a way to make Spring let this request reach me, ignoring the content type? Creating a filter and changing the content type was not feasible, since HttpServletRequest does not allow mutating the content type. I am out of ideas, but I really think there has to be a way to allow custom content types.
UPDATE
If I remove the @RequestBody annotation then I don't get the 415 error, but I have no way to get the request body, since the HttpServletRequest reaches the controller action empty.
Your best bet is to remove the consumes argument from the @RequestMapping. The moment you add it, Spring will try to parse the header into a known type via MediaType.parseMediaType(request.getContentType()), which tries to create a new MimeType(type, subtype, parameters) and thus throws an exception due to the invalid charset format being passed.
However, if you remove consumes and you want to validate/restrict the incoming Content-Type to certain types, you can inject HttpServletRequest as a method parameter and check the value of request.getHeader(HttpHeaders.CONTENT_TYPE).
You also have to remove the @RequestBody annotation so Spring doesn't attempt to parse the content type while unmarshalling the body. If you then directly read request.getInputStream() or request.getReader(), you will see null, as the stream has already been consumed by Spring. To get access to the input content, wrap the request in Spring's ContentCachingRequestWrapper, injected via a Filter; you can then read the content repeatedly, as it is cached rather than read from the original stream.
I am including some code snippets here for reference; for an executable example, you can refer to my GitHub repo. It's a Spring Boot project with Maven; once you launch it, you can send your POST request to http://localhost:3007/badmedia and it will reflect the request content type and body back in the response. Hope this helps.
@RestController
public class BadMediaController {

    @PostMapping("/badmedia")
    @ResponseBody
    public Object reflect(HttpServletRequest request) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        JsonNode rootNode = mapper.createObjectNode();
        ((ObjectNode) rootNode).put("contentType", request.getHeader(HttpHeaders.CONTENT_TYPE));
        String body = new String(((ContentCachingRequestWrapper) request).getContentAsByteArray(), StandardCharsets.UTF_8);
        body = URLDecoder.decode(body, StandardCharsets.UTF_8.name());
        ((ObjectNode) rootNode).put("body", body);
        return mapper.writerWithDefaultPrettyPrinter().writeValueAsString(rootNode);
    }
}
@Component
public class CacheRequestFilter extends GenericFilterBean {

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest cachedRequest
                = new ContentCachingRequestWrapper((HttpServletRequest) servletRequest);
        // invoke caching
        cachedRequest.getParameterMap();
        chain.doFilter(cachedRequest, servletResponse);
    }
}

Spring Boot Callback after client receives resource?

I'm creating an endpoint using Spring Boot which executes a combination of system commands (the java.lang.Runtime API) to generate a zip file to return to the client upon request. Here's the code:
@GetMapping(value = "generateZipFile")
public ResponseEntity<Resource> generateZipFile(@RequestParam("id") Integer id) throws IOException {
    org.springframework.core.io.Resource resource = null;
    // generate zip file using the command line
    resource = service.generateTmpResource(id);
    return ResponseEntity.ok()
            .header(HttpHeaders.CONTENT_TYPE, "application/zip")
            .header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"randomFile.zip\"")
            .body(resource);
    // somehow delete the generated file here after the client receives it
}
I cannot keep stacking up the files on the server for obvious disk limit reasons, so I'm looking for a way to delete the files as soon as the client downloads them. Is there a solution in Spring Boot for this? I basically need to hook a callback that would do the cleanup after the user receives the resource.
I'm using Spring Boot 2.0.6
You can create a new thread, but a better solution would be to create a ThreadPoolExecutor to manage the threads; the @Scheduled annotation can also help here.
new Thread() {
    @Override
    public void run() {
        service.cleanup(id);
    }
}.start();
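A minimal sketch of the ThreadPoolExecutor variant mentioned above (service.cleanup(id) is taken from the question; the pool size is an assumption):
// Sketch: a small shared pool instead of ad-hoc threads.
private final ExecutorService cleanupPool = Executors.newFixedThreadPool(2);

// later, instead of new Thread() {...}.start():
cleanupPool.submit(() -> service.cleanup(id));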
UPDATED
A better answer would be to use a Stack combined with a Thread.
Here is the solution I put together:
https://github.com/jjohxx/example-thread
I ended up using a HandlerInterceptorAdapter; afterCompletion was the callback I needed. The only challenge I had to deal with was passing through the id of the resource to clean up, which I handled by adding a header in my controller method:
@GetMapping(value = "generateZipFile")
public ResponseEntity<Resource> generateZipFile(@RequestParam("id") Integer id,
        RedirectAttributes redirectAttributes) throws IOException {
    org.springframework.core.io.Resource resource = myService.generateTmpResource(id);
    return ResponseEntity.ok()
            .header(HttpHeaders.CONTENT_TYPE, "application/zip")
            .header(MyInterceptor.TMP_RESOURCE_ID_HEADER, id.toString())
            .header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"someFile.zip\"")
            .body(resource);
}
The interceptor code:
@Component
public class MyInterceptor extends HandlerInterceptorAdapter {

    public static final String TMP_RESOURCE_ID_HEADER = "Tmp-ID";

    @Autowired
    private MyService myService;

    @Override
    public void afterCompletion(HttpServletRequest request,
            HttpServletResponse response,
            Object handler,
            Exception ex) {
        if (response == null || !response.containsHeader(TMP_RESOURCE_ID_HEADER)) return;
        String tmpFileId = response.getHeader(TMP_RESOURCE_ID_HEADER);
        myService.cleanup(tmpFileId);
    }
}
For more information about interceptors see here.
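Note that the interceptor also has to be registered with Spring MVC; a minimal sketch, assuming the standard WebMvcConfigurer mechanism (the config class name is made up):
// Sketch: register MyInterceptor so afterCompletion actually runs.
@Configuration
public class WebConfig implements WebMvcConfigurer {

    @Autowired
    private MyInterceptor myInterceptor;

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(myInterceptor);
    }
}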

Streaming upload via #Bean-provided RestTemplateBuilder buffers full file

I'm building a reverse-proxy for uploading large files (multiple gigabytes), and therefore want to use a streaming model that does not buffer entire files. Large buffers would introduce latency and, more importantly, they could result in out-of-memory errors.
My client class contains
@Autowired private RestTemplate restTemplate;

@Bean
public RestTemplate restTemplate(RestTemplateBuilder restTemplateBuilder) {
    int REST_TEMPLATE_MODE = 1; // 1=streams, 2=streams, 3=buffers
    return
        REST_TEMPLATE_MODE == 1 ? new RestTemplate() :
        REST_TEMPLATE_MODE == 2 ? (new RestTemplateBuilder()).build() :
        REST_TEMPLATE_MODE == 3 ? restTemplateBuilder.build() : null;
}
and
public void upload_via_streaming(InputStream inputStream, String originalname) {
    SimpleClientHttpRequestFactory requestFactory = new SimpleClientHttpRequestFactory();
    requestFactory.setBufferRequestBody(false);
    restTemplate.setRequestFactory(requestFactory);
    InputStreamResource inputStreamResource = new InputStreamResource(inputStream) {
        @Override public String getFilename() { return originalname; }
        @Override public long contentLength() { return -1; }
    };
    MultiValueMap<String, Object> body = new LinkedMultiValueMap<String, Object>();
    body.add("myfile", inputStreamResource);
    HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.MULTIPART_FORM_DATA);
    HttpEntity<MultiValueMap<String, Object>> requestEntity = new HttpEntity<>(body, headers);
    String response = restTemplate.postForObject(UPLOAD_URL, requestEntity, String.class);
    System.out.println("response: " + response);
}
This is working, but notice my REST_TEMPLATE_MODE value controls whether or not it meets my streaming requirement.
Question: Why does REST_TEMPLATE_MODE == 3 result in full-file buffering?
References:
How to forward large files with RestTemplate?
How to send Multipart form data with restTemplate Spring-mvc
Spring - How to stream large multipart file uploads to database without storing on local file system -- establishing the InputStream
How to autowire RestTemplate using annotations
Design notes and usage caveats, also: restTemplate does not support streaming downloads
In short, the RestTemplateBuilder instance provided as a @Bean by Spring Boot includes an interceptor (filter) associated with actuator/metrics, and the interceptor interface requires buffering the request body into a simple byte[].
If you instantiate your own RestTemplateBuilder or RestTemplate from scratch, it won't include this by default.
I seem to be the only person visiting this post, but just in case it helps someone before I get around to posting a complete solution, I've found a big clue:
restTemplate.getInterceptors().forEach(item->System.out.println(item));
displays...
org.springframework.boot.actuate.metrics.web.client.MetricsClientHttpRequestInterceptor
If I clear the interceptor list via setInterceptors, it solves the problem. Furthermore, I found that any interceptor, even if it only performs a NOP, will introduce full-file buffering.
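A minimal sketch of that workaround:
// Sketch: drop all interceptors (including the actuator metrics one)
// so the streaming SimpleClientHttpRequestFactory stays in effect.
restTemplate.setInterceptors(Collections.emptyList());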
public class SimpleClientHttpRequestFactory { ...
I have explicitly set bufferRequestBody = false, but apparently this code is bypassed if interceptors are used. This would have been nice to know earlier...
@Override
public ClientHttpRequest createRequest(URI uri, HttpMethod httpMethod) throws IOException {
    HttpURLConnection connection = openConnection(uri.toURL(), this.proxy);
    prepareConnection(connection, httpMethod.name());
    if (this.bufferRequestBody) {
        return new SimpleBufferingClientHttpRequest(connection, this.outputStreaming);
    }
    else {
        return new SimpleStreamingClientHttpRequest(connection, this.chunkSize, this.outputStreaming);
    }
}
public abstract class InterceptingHttpAccessor extends HttpAccessor { ...
This shows that the InterceptingClientHttpRequestFactory is used if the list of interceptors is not empty.
/**
 * Overridden to expose an {@link InterceptingClientHttpRequestFactory}
 * if necessary.
 * @see #getInterceptors()
 */
@Override
public ClientHttpRequestFactory getRequestFactory() {
    List<ClientHttpRequestInterceptor> interceptors = getInterceptors();
    if (!CollectionUtils.isEmpty(interceptors)) {
        ClientHttpRequestFactory factory = this.interceptingRequestFactory;
        if (factory == null) {
            factory = new InterceptingClientHttpRequestFactory(super.getRequestFactory(), interceptors);
            this.interceptingRequestFactory = factory;
        }
        return factory;
    }
    else {
        return super.getRequestFactory();
    }
}
class InterceptingClientHttpRequest extends AbstractBufferingClientHttpRequest { ...
The interfaces make it clear that using InterceptingClientHttpRequest requires buffering the body into a byte[]. There is no option to use a streaming interface.
@Override
public ClientHttpResponse execute(HttpRequest request, byte[] body) throws IOException {

Can I use Spring WebFlux to implement REST services which get data through Kafka request/response topics?

I'm developing a REST service which, in turn, will query a slow legacy system, so response times will be measured in seconds. We also expect massive load, so I was thinking about asynchronous/non-blocking approaches to avoid hundreds of "servlet" threads blocked on calls to the slow system.
As I see it, this can be implemented using AsyncContext, which is present in the new Servlet API specs. I even developed a small prototype and it seems to work.
On the other hand, it looks like I can achieve the same using Spring WebFlux.
Unfortunately I did not find any example where custom "backend" calls are wrapped with Mono/Flux. Most of the examples just reuse already-prepared reactive connectors, like ReactiveCassandraOperations, etc.
My data flow is the following:
JS client --> Spring RestController --> send request to Kafka topic --> read response from Kafka reply topic --> return data to client
Can I wrap Kafka steps into Mono/Flux and how to do this?
What should my RestController method look like?
Here is my simple implementation which achieves the same using Servlet 3.1 API
// took the idea from some Jetty examples
public class AsyncRestServlet extends HttpServlet {
    ...

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        String result = (String) req.getAttribute(RESULTS_ATTR);
        if (result == null) { // data not ready yet: schedule async processing
            final AsyncContext async = req.startAsync();
            // generate some unique request ID
            String uid = "req-" + String.valueOf(req.hashCode());
            // share it with the Kafka receiver together with the AsyncContext;
            // when the Kafka receiver gets the response, it puts it in a Servlet request attribute and calls async.dispatch();
            // this doGet() method is then called again and sends the response to the client
            receiver.rememberKey(uid, async);
            // send request to Kafka
            sender.send(uid, param);
            // data is not ready yet, so we release the Servlet thread
            return;
        }
        // return result as html response
        resp.setContentType("text/html");
        PrintWriter out = resp.getWriter();
        out.println(result);
        out.close();
    }
}
Here's a short example. It's not the WebFlux client you probably had in mind, but at least it should enable you to utilize Flux and Mono for asynchronous processing, which I interpreted to be the point of your question. The web objects should work without additional configuration, but of course you will need to configure Kafka, as the KafkaTemplate object will not work on its own.
@Bean // Using org.springframework.web.reactive.function.server.RouterFunction<ServerResponse>
public RouterFunction<ServerResponse> sendMessageToTopic(KafkaController kafkaController) {
    return RouterFunctions.route(RequestPredicates.POST("/endpoint"), kafkaController::sendMessage);
}

@Component
public class ResponseHandler {
    public Mono<ServerResponse> getServerResponse() {
        return ServerResponse.ok().body(Mono.just("SUCCESS"), String.class);
    }
}

@Component
public class KafkaController {

    @Autowired
    private ResponseHandler responseHandler;

    @Autowired
    private MyKafkaPublisher myKafkaPublisher;

    public Mono<ServerResponse> sendMessage(ServerRequest request) {
        return request.bodyToMono(TopicMsgMap.class)
                // your HTTP call may not return immediately without this
                .subscribeOn(Schedulers.single()) // for a single worker thread
                .flatMap(topicMsgMap -> {
                    myKafkaPublisher.sendMessages(topicMsgMap.getTopicMsgMap());
                    return responseHandler.getServerResponse();
                });
    }
}

@Data // model class just to easily convert the ServerRequest (from json, for ex.)
// + constructors
public class TopicMsgMap {
    private Map<String, String> topicMsgMap;
}

@Service // Using org.springframework.kafka.core.KafkaTemplate<String, String>
public class MyKafkaPublisher {

    @Autowired
    private KafkaTemplate<String, String> template;

    @Value("${topic1}")
    private String topic1;

    @Value("${topic2}")
    private String topic2;

    public void sendMessages(Map<String, String> topicMsgMap) {
        topicMsgMap.forEach((topic, message) -> {
            if (topic.equals("topic1")) template.send(topic1, message);
            if (topic.equals("topic2")) template.send(topic2, message);
        });
    }
}
Guessing this isn't the use case you had in mind, but I hope you find this general structure useful.
There are several approaches to this problem, including ReplyingKafkaTemplate, but continuing the approach from your Servlet API version, the solution will look something like this in Spring WebFlux.
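For reference, a minimal sketch of the ReplyingKafkaTemplate route (topic name, payload types, and wiring are assumptions):
// Sketch: request/reply over Kafka with built-in correlation handling.
@Autowired
private ReplyingKafkaTemplate<String, Request, Response> replyingTemplate;

public Mono<Response> sendAndReceive(Request request) {
    ProducerRecord<String, Request> record = new ProducerRecord<>("request-topic", request);
    RequestReplyFuture<String, Request, Response> future = replyingTemplate.sendAndReceive(record);
    return Mono.fromFuture(future.completable().thenApply(ConsumerRecord::value));
}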
Your Controller method looks like this:
@RequestMapping(path = "/completable-future", method = RequestMethod.POST)
Mono<Response> asyncTransaction(@RequestBody RequestDto requestDto, @RequestHeader Map<String, String> requestHeaders) {
    String internalTransactionId = UUID.randomUUID().toString();
    kafkaSender.send(Request.builder()
            .transactionId(requestHeaders.get("transactionId"))
            .internalTransactionId(internalTransactionId)
            .sourceIban(requestDto.getSourceIban())
            .destIban(requestDto.getDestIban())
            .build());
    CompletableFuture<Response> completableFuture = new CompletableFuture<>();
    taskHolder.pushTask(completableFuture, internalTransactionId);
    return Mono.fromFuture(completableFuture);
}
Your taskHolder component will be something like this:
@Component
public class TaskHolder {

    private Map<String, CompletableFuture<Response>> taskHolder = new ConcurrentHashMap<>();

    public void pushTask(CompletableFuture<Response> task, String transactionId) {
        this.taskHolder.put(transactionId, task);
    }

    public Optional<CompletableFuture<Response>> remove(String transactionId) {
        return Optional.ofNullable(this.taskHolder.remove(transactionId));
    }
}
And finally your Kafka ResponseListener looks like this:
@Component
public class ResponseListener {

    @Autowired
    TaskHolder taskHolder;

    @KafkaListener(topics = "reactive-response-topic", groupId = "test")
    public void listen(Response response) {
        taskHolder.remove(response.getInternalTransactionId())
                .orElse(new CompletableFuture<>())
                .complete(response);
    }
}
In this example I used internalTransactionId as the correlation id, but you can also use "kafka_correlationId", which is a well-known Kafka header.
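A minimal sketch of setting that header on the outbound message (topic and payload are assumptions):
// Sketch: KafkaHeaders.CORRELATION_ID resolves to "kafka_correlationId";
// the header value is conventionally a byte[].
Message<String> message = MessageBuilder
        .withPayload("payload")
        .setHeader(KafkaHeaders.TOPIC, "request-topic")
        .setHeader(KafkaHeaders.CORRELATION_ID, internalTransactionId.getBytes(StandardCharsets.UTF_8))
        .build();
kafkaTemplate.send(message);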
