Storing a (possibly large) file between requests in Spring - spring

I have a controller method that, depending on the parameters entered by the user, downloads a certain PDF file and shows a view with its pages converted to PNG.
The way I approached it works like this:
First I map a method to receive the POST data sent by the user, then generate the URL of the actual PDF converter and pass it to the model:
@RequestMapping(method = RequestMethod.POST)
public String formPost(Model model, HttpServletRequest request) {
    // Gather parameters and generate PDF url
    Long idPdf = Long.parseLong(request.getParameter("idPdf"));
    // Feed the JSP the URL of the to-be-generated image
    model.addAttribute("image", "getImage?idPdf=" + idPdf);
    return "pdfView"; // placeholder view name; the original snippet omitted the return
}
Then in getImageMethod I download the PDF and then generate a PNG out of it:
@RequestMapping("/getImage")
public HttpEntity<byte[]> getPdfToImage(@RequestParam Long idPdf) throws IOException {
    String url = "myPDFrepository?idPDF=" + idPdf;
    URL urlUrl = new URL(url);
    URLConnection urlConnection = urlUrl.openConnection();
    // Close the stream once the converter has read it
    try (InputStream is = urlConnection.getInputStream()) {
        return PDFtoPNGConverter.convert(is);
    }
}
My JSP just has an img tag that refers to this url:
<img src="${image}" />
So far this works perfectly. But now I need to support viewing multi-page PDFs converted to PNGs, with each PDF page shown on a separate page of the view. So I would add a page parameter, feed my model the image URL including that page parameter, and in my getImage method convert only that page.
But the way it is implemented, I would be downloading the PDF again for each page, plus one more time for the view, so that it can find out whether this specific PDF has more pages and show the "prev" and "next" buttons.
What would be a good way to keep the same file across these requests, so that I download it just once? I thought about using temp files, but then managing their deletion might be a problem. Would storing the PDF in the session be a good solution? I don't know whether that is good practice.
I am using Spring MVC, by the way.

I think the simplest way would be to use the Spring cache abstraction. Look at the tutorial; you will need to change your code a little by moving the logic that loads the PDF into a separate class.
It would look like this:
interface PDFRepository {
    byte[] getImage(long id) throws IOException;
}
@Repository
public class PDFRepositoryImpl implements PDFRepository {
    // "pdfImages" is a placeholder cache name; the id argument is used as the cache key
    @Cacheable("pdfImages")
    public byte[] getImage(long id) throws IOException {
        String url = "myPDFrepository?idPDF=" + id;
        URL urlUrl = new URL(url);
        URLConnection urlConnection = urlUrl.openConnection();
        try (InputStream is = urlConnection.getInputStream()) {
            // Assumes the converter is adapted to return the raw PNG bytes
            return PDFtoPNGConverter.convert(is);
        }
    }
}
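Once caching is enabled (for example with @EnableCaching) and a CacheManager is configured, you get pluggable cache implementations, and cache expiration is handled by whichever provider you plug in.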
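To make the idea concrete without pulling in Spring, here is a minimal stdlib-only sketch of what caching the download by id amounts to: the first call for an id runs the loader, and later calls within the time-to-live return the stored bytes. The class name PdfByteCache, the loader function and the time-to-live parameter are assumptions made for illustration; this is not how Spring implements @Cacheable, and a real cache provider handles eviction and memory limits far more carefully.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongFunction;

/** Hypothetical helper: keeps the bytes loaded for an id for a limited time. */
final class PdfByteCache {

    /** A cached value together with the time it was loaded. */
    private static final class Entry {
        final byte[] data;
        final long loadedAtMillis;

        Entry(byte[] data, long loadedAtMillis) {
            this.data = data;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<Long, Entry> entries = new ConcurrentHashMap<>();
    private final LongFunction<byte[]> loader;
    private final long ttlMillis;

    PdfByteCache(LongFunction<byte[]> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached bytes for id, invoking the loader only if they are missing or expired. */
    byte[] get(long id) {
        long now = System.currentTimeMillis();
        Entry entry = entries.compute(id, (key, old) -> {
            if (old != null && now - old.loadedAtMillis < ttlMillis) {
                return old; // still fresh: reuse without downloading again
            }
            return new Entry(loader.apply(key), now); // missing or stale: load once
        });
        return entry.data;
    }

    /** Drops the entry for id so the next get() reloads it. */
    void evict(long id) {
        entries.remove(id);
    }
}
```

With something like this, the image endpoint and the view could both call get(idPdf), and only the first call would hit the PDF repository.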

Related

Tomcat Performance with Spring Boot API for File Upload

I have a Spring Boot API, and one of its endpoints allows users to upload videos. My controller takes the file as a MultipartFile and stores it in a temp folder accessible to Tomcat. Once it is stored on disk, I push the video to an S3 bucket.
To me this seems less than optimal: if 100 or 1,000 users upload at once, writing every file to disk first seems very inefficient.
As a little background, I store the file on disk so that I can retry if there is an issue pushing to S3.
The code below may show what I'm doing better than the description above:
public Video addVideo(@RequestParam("title") String title,
        @RequestParam("Description") String description,
        @RequestParam(value = "file", required = true) MultipartFile file) {
    this.amazonS3ClientService.uploadFileToS3Bucket(file, title, description);
    // ... (building and returning the Video is not shown in the original)
}
Method for storing Video file:
String fileNameWithExtenstion = awsS3FileName + "." + FilenameUtils.getExtension(multipartFile.getOriginalFilename());
// creating the file in the server (temporarily)
File file = new File(tomcatTempDir + fileNameWithExtenstion);
FileOutputStream fos = new FileOutputStream(file);
fos.write(multipartFile.getBytes());
fos.close();
PutObjectRequest putObjectRequest = new PutObjectRequest(this.awsS3Bucket, awsS3BucketFolder + UnigueId + "/" + fileNameWithExtenstion, file);
if (enablePublicReadAccess) {
    putObjectRequest.withCannedAcl(CannedAccessControlList.PublicRead);
}
// Upload a file as a new object with ContentType and title specified.
amazonS3.putObject(putObjectRequest);
// removing the file created in the server
file.delete();
So my question is: is there a better way in Tomcat to:
A) Take in a file via a controller
B) Push to S3
There is no other way to do it with multipart. The problem with multipart is that, to properly segment the parts of the request, some parts sometimes need to be skipped or read repeatedly. That is impossible to do in memory without memory use exploding. Therefore, Commons FileUpload caches parts on disk once a certain size threshold is reached.
Multipart requests are the worst way to do this. I highly recommend using either PUT or POST with content type application/octet-stream. You can take the bare request input stream and pass it to HttpClient to stream to your backend server. I did this five years ago and it works for gigabytes. I posted the solution on the Apache HttpClient mailing list.
There is one way this could work with multipart, under specific conditions:
All parts arrive in the physical order in which you want to read them
Your writes to the backend are fast enough to keep up with the reads from the front
Consume the root part, then move on to the next physical one, processing the request body lazily. JAX-WS RI (Metro) handles multipart requests very well for XOP/MTOM. Learn from that, because you won't be able to make it any better.
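The streaming approach in this answer boils down to copying from the request's input stream to the backend's output stream through a small fixed buffer, so memory use stays constant regardless of the upload size. A minimal sketch follows; the class name StreamRelay and the buffer size are illustrative choices, and the in-memory streams in main stand in for what would be the servlet request's input stream and the outgoing connection to the backend in a real controller (both assumptions, not shown in the original).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper that relays a stream without buffering the whole payload. */
final class StreamRelay {
    private static final int BUFFER_SIZE = 8192; // arbitrary, small and fixed

    /** Copies everything from source to sink and returns the number of bytes relayed. */
    static long relay(InputStream source, OutputStream sink) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        // Only one buffer's worth of data is held in memory at any time
        while ((read = source.read(buffer)) != -1) {
            sink.write(buffer, 0, read);
            total += read;
        }
        sink.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the request body and the backend connection
        byte[] payload = new byte[100_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = relay(new ByteArrayInputStream(payload), sink);
        System.out.println("relayed " + copied + " bytes");
    }
}
```

The same loop works whether the sink is an HTTP client's request body or any other OutputStream; the condition from the answer still applies, namely that the write side must keep up with the read side.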
Perhaps you can try streaming the input stream from your MultipartFile directly to S3.
Consider the following uploadFileToS3Bucket method:
public PutObjectResult uploadFileToS3Bucket(InputStream input, long size, String title, String description) {
    // Indicate the content length up front so the AWS SDK does not need to compute it
    // See: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/PutObjectRequest.html#PutObjectRequest-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
    ObjectMetadata objectMetadata = new ObjectMetadata();
    // Rely on the size reported by Spring; input.available() is only an estimate and is not a safe substitute
    objectMetadata.setContentLength(size);
    // compute the object name as appropriate
    String key = "...";
    PutObjectRequest putObjectRequest = new PutObjectRequest(
        this.awsS3Bucket, key, input, objectMetadata
    );
    // The rest of your code
    if (enablePublicReadAccess) {
        putObjectRequest.withCannedAcl(CannedAccessControlList.PublicRead);
    }
    // Upload a file as a new object with ContentType and title specified.
    return amazonS3.putObject(putObjectRequest);
}
Of course, you need to pass the service the input stream obtained from the client request through the MultipartFile object:
public Video addVideo(
        @RequestParam("title") String title,
        @RequestParam("Description") String description,
        @RequestParam(value = "file", required = true) MultipartFile file) throws IOException {
    try (InputStream input = file.getInputStream()) {
        this.amazonS3ClientService.uploadFileToS3Bucket(input, file.getSize(), title, description);
    }
    // ... (building and returning the Video is not shown in the original)
}
You could probably also use the getBytes method of MultipartFile and create a ByteArrayInputStream to perform the operation, at the cost of holding the whole file in memory as a byte array.
In addVideo:
byte[] bytes = file.getBytes();
In uploadFileToS3Bucket:
ObjectMetadata objectMetadata = new ObjectMetadata();
objectMetadata.setContentLength(bytes.length);
PutObjectRequest putObjectRequest = new PutObjectRequest(
this.awsS3Bucket, key, new ByteArrayInputStream(bytes), objectMetadata
);
I would prefer the first solution, but try to determine which option offers you the best performance.

Retrieving image from Spring Controller to display in a Thymeleaf template

I am trying to display an image in a Thymeleaf template that comes from a MySQL database.
Most of it is working, but I have a problem with the controller code:
@GetMapping(value = "/image/{someId}", produces = {MediaType.IMAGE_JPEG_VALUE, MediaType.IMAGE_PNG_VALUE})
public ResponseEntity<byte[]> getPicture(@PathVariable("someId") long someId) throws IOException {
    byte[] imageContent = ...; // the blob element, obtained from the repo through a service
    final HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.IMAGE_JPEG);
    headers.set("Content-Disposition", "attachment; filename=fileName.jpg");
    headers.setContentLength(imageContent.length);
    return new ResponseEntity<byte[]>(imageContent, headers, HttpStatus.OK);
}
My code is similar to solutions that were suggested here (see this answer or this one), but my template won't display the image, most probably because its content is not valid. If I hit the endpoint directly from the browser, a "fileName.jpg" file is downloaded (which makes sense given the code above), but when I open it with a text editor it contains only the original name of the image. It's not an image.
There's surely something I am missing in the configuration of the ResponseEntity. Do you perhaps have some documentation I could read about the process? I don't fully understand what is going on.
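One way to check the suspicion that the returned bytes are not image data is to compare their first bytes with the well-known file signatures: a JPEG file starts with FF D8 FF, and a PNG file starts with 89 50 4E 47 0D 0A 1A 0A. The sketch below is only a diagnostic aid under that assumption; the class name ImageSniffer is hypothetical, and it says nothing about where in the service or repository the wrong value comes from.

```java
/** Hypothetical diagnostic helper: guesses an image type from the leading bytes. */
final class ImageSniffer {
    private static final int[] PNG_SIGNATURE = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
    private static final int[] JPEG_SIGNATURE = {0xFF, 0xD8, 0xFF};

    /** Returns "image/png", "image/jpeg", or null when the content matches neither signature. */
    static String detect(byte[] content) {
        if (startsWith(content, PNG_SIGNATURE)) {
            return "image/png";
        }
        if (startsWith(content, JPEG_SIGNATURE)) {
            return "image/jpeg";
        }
        return null;
    }

    private static boolean startsWith(byte[] content, int[] signature) {
        if (content == null || content.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            // Mask to compare the byte as an unsigned value
            if ((content[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Logging ImageSniffer.detect(imageContent) in the controller would show whether the blob read from the database is an image at all; a null result for a value such as the bytes of a file name points at how the blob was stored or loaded rather than at the ResponseEntity.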

Spring Multipart File with @RequestBody

I am trying to upload data from an app to a spring backend service.
Things to upload are a DataModel containing data of the object to create and several images linked to the data.
Therefore I am using this method signature:
@RequestMapping(method = RequestMethod.POST)
@ResponseBody
public Survey createSurvey(@RequestBody SurveyPostHelper helper, @RequestParam(value="file", required = true) MultipartFile[] images)
I tried to play with the annotations, but either I get a blank images array or my helper is empty.
How would you solve this?
Thanks in advance.
I found out that this method signature does the job:
@ResponseBody
public Survey createSurvey(@RequestPart(required=true) SurveyPostHelper helper, @RequestPart(value="file", required = true) final MultipartFile[] images)
What was important in my case was setting the MIME type of each part in the client app. The files' MIME type should be an image type (image/jpeg is the registered type for JPEG) and the SurveyPostHelper's should be application/json, so that Spring can parse the JSON and bind it to my object.
Here is an example of client code that works for me; images is the list of files I want to save:
var formData = new FormData();
for (var i = 0; i < images.length ; i++) {
formData.append('images', images[i]);
}
formData.append('adData', new Blob([JSON.stringify(adData)], {
type: "application/json"
}));
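To show what the per-part MIME types look like on the wire, here is a stdlib-only sketch that assembles a multipart/form-data body with a JSON part and a file part, each carrying its own Content-Type header. The part names "helper" and "file" follow the method signature above; the class name MultipartSketch, the boundary string and the file name photo.jpg are made up for illustration, and a real client would normally let its HTTP library build this body.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of a multipart body with one JSON part and one image part. */
final class MultipartSketch {
    static final String BOUNDARY = "sketch-boundary-1234"; // arbitrary example value

    /** Builds the request body; the JSON part is named "helper", the image part "file". */
    static byte[] build(String helperJson, byte[] imageBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePart(out, "helper", null, "application/json",
                helperJson.getBytes(StandardCharsets.UTF_8));
        writePart(out, "file", "photo.jpg", "image/jpeg", imageBytes);
        writeAscii(out, "--" + BOUNDARY + "--\r\n"); // closing delimiter
        return out.toByteArray();
    }

    private static void writePart(ByteArrayOutputStream out, String name, String filename,
                                  String contentType, byte[] body) {
        StringBuilder header = new StringBuilder();
        header.append("--").append(BOUNDARY).append("\r\n");
        header.append("Content-Disposition: form-data; name=\"").append(name).append('"');
        if (filename != null) {
            header.append("; filename=\"").append(filename).append('"');
        }
        header.append("\r\n");
        // The per-part Content-Type is what lets the server pick a converter for the part
        header.append("Content-Type: ").append(contentType).append("\r\n\r\n");
        writeAscii(out, header.toString());
        out.write(body, 0, body.length);
        writeAscii(out, "\r\n");
    }

    private static void writeAscii(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }
}
```

The request itself would then be sent with the header Content-Type: multipart/form-data; boundary=sketch-boundary-1234. Whatever client is used, the part names have to match the names the server-side method expects.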

Compression response filter fails on breeze.js Metadata call

I have an HTTP module in which I add the response filter shown below for compression. This works for all API calls except one: the call to Metadata. If I remove the [BreezeController] decoration it works fine. I think it has to do with the action filter attribute that converts the string return type into an HttpResponse return type with string content.
The error I'm getting is "Exception message: The stream state of the underlying compression routine is inconsistent."
I've done some testing, and a method that is defined to return an HttpResponse works fine. So I think the problem is the scenario where the method is defined to return a string and the action filter changes it to an HttpResponse at runtime.
Any ideas how I can get this to work?
Here's the response filter being added in BeginRequest:
HttpApplication app = (HttpApplication)sender;
// Check the header to see if it can accept compressed output
string encodings = app.Request.Headers.Get("Accept-Encoding");
if (encodings == null)
return;
Stream s = app.Response.Filter;
encodings = encodings.ToLower();
if (encodings.Contains("gzip"))
{
app.Response.Filter = new GZipStream(s, CompressionMode.Compress);
app.Response.AppendHeader("Content-Encoding", "gzip");
}
Don't know the specifics of what you're doing but I know that the [BreezeController] attribute strips out filters and adds back just the ones that breeze wants.
One approach might be to define a separate controller (ModelMetadataController) that only serves the metadata. This controller doesn't have the [BreezeController] attribute; it's a plain old Web API controller.
Then you create a "Breeze controller" (ModelController) with all of the usual methods except the Metadata method.
You call the metadata controller from the client during app launch via MetadataStore.fetchMetadata just to get metadata.
Once you have populated a metadataStore in this fashion, you use it in your EntityManager which sends query and save requests to the "real" Web API data controller.
The client code might look something like this:
var ds = new breeze.DataService({
serviceName: 'breeze/Model' // the breeze query & save controller
});
var ms = new breeze.MetadataStore({
namingConvention: breeze.NamingConvention.camelCase // assuming that's what you want
});
ms.addDataService(ds); // associate the metadata-to-come with the "real" dataService
var manager = new breeze.EntityManager({
dataService: ds,
metadataStore: ms
});
// the fun bit: fetch the metadata from a different controller
var promise = ms.fetchMetadata('breeze/ModelMetadata') // the metadata-only controller!
return promise; // wait on it appropriately

Capture current JSF page content

I want to capture the current page and send it to an application that converts it to pdf.
This is what I am using:
FacesContext facesContext=FacesContext.getCurrentInstance();
HttpServletResponse response = (HttpServletResponse)
facesContext.getExternalContext().getResponse();
HttpServletRequest request = (HttpServletRequest) facesContext.getExternalContext().getRequest();
// RequestPrinter.debugString();
response.reset();
// download a pdf file
response.setContentType("application/pdf");
response.setHeader("Content-Disposition","attachment;filename="+new Date().toString()+".pdf");
prince.setVerbose(true);
prince.setLog(logFile);
try{
//getPath() to the page the user is currently on
URL pagePath=new URL(this.getPath());
URLConnection urlConnection = pagePath.openConnection();
urlConnection.setDoOutput(true);
int length = urlConnection.getContentLength();
//Lets use inputStream
BufferedInputStream bis=new BufferedInputStream(urlConnection.getInputStream());
response.setContentLength(length);
//this.getPageUsingJSoup().data().getBytes();
//call prince and pass params for inputstream outputStream
prince.convert(bis,response.getOutputStream());
urlConnection.getInputStream().close();
}catch(MalformedURLException mu){
mu.printStackTrace();
}
catch(IOException ioe){
ioe.printStackTrace();
}
facesContext.responseComplete();
Since the website requires authentication, the PDF that gets generated is the login error page.
Is there a way to capture the page's content that uses the current user's session?
Thank you in advance.
Just request the page in the same HTTP session as the current request. If your webapp supports URL rewriting (as by default), then just append session ID as jsessionid path fragment:
String sessionId = ((HttpSession) externalContext.getSession()).getId();
InputStream input = new URL("http://localhost:8080/context/page.jsf;jsessionid=" + sessionId).openStream();
// ...
Or if your webapp doesn't accept URL rewriting, but accepts cookies only, then set it as a request cookie the usual way:
URLConnection connection = new URL("http://localhost:8080/context/page.jsf").openConnection();
connection.setRequestProperty("Cookie", "JSESSIONID=" + sessionId);
InputStream input = connection.getInputStream();
// ...
Note that I removed setDoOutput() since you don't seem to be interested in performing a POST request.
I do not know how to capture the page's content using the current user's session, but I can suggest another way to do it: you could move the PDF conversion logic inside a Selenium test case and use the test case to navigate and log in to the page requiring authentication. After the automated test case has logged in, you could call your PDF conversion logic.
Yes, of course there is. You are sending this content, so you have it. You should store the content object. If you don't have it, inspect your byte streams. The content should be there ;)
There are a couple of websites that let you convert an entire page to PDF and save it as a .pdf file. Try the site http://pdfcrowd.com/. Hope this helps.
