Golang: write HTTP response without waiting for it to finish

I'm building an application that builds a pdf file and returns it to the client whenever it receives a request.
Since some of these pdf files might take some time to generate, I would like to periodically send some sort of status update back to the client while it is running.
When it's finished building the pdf file, it should be returned to the client as well.
Something akin to:
func buildReport(writer http.ResponseWriter, request *http.Request) {
    // build the pdf file
    for { // for example purposes only
        writer.Write([]byte("building. Please wait."))
    }
    pdf.OutputFileAndClose("report.pdf")
    // set header to pdf so that the client knows it's a PDF
    writer.Header().Set("Content-Type", "application/pdf")
    http.ServeFile(writer, request, "report.pdf")
}

func main() {
    http.HandleFunc("/", buildReport)
    http.ListenAndServe(":8081", nil)
}
func main() {
http.HandleFunc("/", buildReport)
http.ListenAndServe(":8081", nil)
}
Setting the header there might not work: the response headers are sent with the first call to Write, so they can't be changed afterwards.

TL;DR: it cannot be implemented that way. You need to:
Provide an API that requests PDF creation and queues a PDF-creation job in a task queue (so that too many PDF-creation requests won't exhaust the HTTP server's worker pool).
Provide an API that lets the client check how far along the PDF rendering is (I am assuming that the job can provide interim stats). This is polled by the client on a regular basis.
Provide an API to pull the PDF once it is ready. (The three endpoints are sketched below.)
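A minimal sketch of those three endpoints in Go, assuming an in-memory job store and a hypothetical renderPDF function (a real service would persist jobs and generate proper ids):

package main

import (
    "fmt"
    "net/http"
    "sync"
)

type job struct {
    id     string
    status string // "queued", "done", or "failed"
    file   string // path of the finished PDF
}

var (
    mu    sync.Mutex
    jobs  = map[string]*job{}
    queue = make(chan *job, 100) // bounded: excess requests get rejected
)

// renderPDF is a hypothetical stand-in for the real report builder.
func renderPDF(id string) (string, error) {
    // ... build the PDF here ...
    return "report-" + id + ".pdf", nil
}

// worker drains the task queue so PDF builds can't exhaust the server.
func worker() {
    for j := range queue {
        file, err := renderPDF(j.id)
        mu.Lock()
        if err != nil {
            j.status = "failed"
        } else {
            j.status, j.file = "done", file
        }
        mu.Unlock()
    }
}

// POST /reports: queue a build job and hand back its id.
func createReport(w http.ResponseWriter, r *http.Request) {
    mu.Lock()
    id := fmt.Sprint(len(jobs) + 1) // toy id; use UUIDs in practice
    j := &job{id: id, status: "queued"}
    jobs[id] = j
    mu.Unlock()
    select {
    case queue <- j:
        w.WriteHeader(http.StatusAccepted)
        fmt.Fprintf(w, `{"id":%q}`, id)
    default: // queue full: shed load instead of blowing the worker pool
        http.Error(w, "too busy", http.StatusServiceUnavailable)
    }
}

// GET /reports/status?id=...: polled by the client on a regular basis.
func reportStatus(w http.ResponseWriter, r *http.Request) {
    mu.Lock()
    j, ok := jobs[r.URL.Query().Get("id")]
    mu.Unlock()
    if !ok {
        http.NotFound(w, r)
        return
    }
    fmt.Fprintf(w, `{"status":%q}`, j.status)
}

// GET /reports/download?id=...: pull the PDF once it is ready.
func reportDownload(w http.ResponseWriter, r *http.Request) {
    mu.Lock()
    j, ok := jobs[r.URL.Query().Get("id")]
    mu.Unlock()
    if !ok || j.status != "done" {
        http.Error(w, "not ready", http.StatusConflict)
        return
    }
    w.Header().Set("Content-Type", "application/pdf")
    http.ServeFile(w, r, j.file)
}

func main() {
    for i := 0; i < 4; i++ { // small fixed worker pool
        go worker()
    }
    http.HandleFunc("/reports", createReport)
    http.HandleFunc("/reports/status", reportStatus)
    http.HandleFunc("/reports/download", reportDownload)
    http.ListenAndServe(":8081", nil)
}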
Hope this helps and best of luck with your project.

This is by no means comprehensive, but a reasonable example of how you might construct your API (which needs to be asynchronous, as the previous respondent pointed out) can be found here: https://www.adayinthelifeof.nl/2011/06/02/asynchronous-operations-in-rest/
The job queue model is a pretty common one. I would recommend you also write a basic API binding library (you'd want this for your own testing purposes in any case) so that your users can understand how you intend them to use the API, and in writing it, you'll get a better sense of how asynchronous REST interactions feel from the end user side.

Contrary to what others have said, what you want is in fact directly possible, but it requires fulfillment of two preconditions:
HTTP/1.1 or above.
You'll be sending custom content to the clients (not PDF data directly), and they're prepared to accept and parse it.
You can then employ the so-called "chunked" transfer encoding, specifically invented to handle "streamed" downloads where the server does not know in advance how many bytes it is about to send.
So you may invent some creative kind of payload where you first periodically
stream a "no op" / "progress" marker and then the actual payload.
Say, while the file is being prepared you periodically send a line of text reading "PROCESSING" + LF; then, when the result is ready, you send a line of text "READY " + SIZE + LF, where SIZE is the size, in bytes, of the immediately following PDF document. After the document is streamed, the server signals the end of data.
Hence the stream would look like
PROCESSING
PROCESSING
…
PROCESSING
READY 8388608
%PDF-1.3
…
%%EOF
The clients have to be able to parse this information from the stream
they're receiving and have a simple FSM in place to switch from state to
state as they fetch your stream.
The server has to make sure it flushes the stream after each "informational" line; otherwise the whole thing would not be "interactive".
If you have a good idea about the overall state of the processing of the document, each "status update" line could include the percentage of the work done, as in "PROCESSING NN" + LF, where NN is that percentage.
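A sketch of that protocol in Go, where renderPDF is a hypothetical stand-in for the real build step; net/http switches to chunked transfer encoding automatically when no Content-Length is set, and http.Flusher provides the per-line flushing:

package main

import (
    "fmt"
    "net/http"
    "os"
    "time"
)

// renderPDF is a hypothetical stand-in for the real PDF build step.
func renderPDF(path string) error {
    time.Sleep(5 * time.Second) // simulate slow work
    return os.WriteFile(path, []byte("%PDF-1.3\n%%EOF\n"), 0o644)
}

func streamReport(w http.ResponseWriter, r *http.Request) {
    flusher, ok := w.(http.Flusher)
    if !ok {
        http.Error(w, "streaming unsupported", http.StatusInternalServerError)
        return
    }
    w.Header().Set("Content-Type", "application/octet-stream")

    done := make(chan error, 1)
    go func() { done <- renderPDF("report.pdf") }()

    for {
        select {
        case err := <-done:
            if err != nil {
                return // a real protocol would signal failure to the client too
            }
            data, err := os.ReadFile("report.pdf")
            if err != nil {
                return
            }
            // Announce the payload size, then stream the PDF bytes.
            fmt.Fprintf(w, "READY %d\n", len(data))
            w.Write(data)
            return
        case <-time.After(time.Second):
            // Keep the download interactive while the renderer works.
            fmt.Fprintf(w, "PROCESSING\n")
            flusher.Flush()
        }
    }
}

func main() {
    http.HandleFunc("/", streamReport)
    http.ListenAndServe(":8081", nil)
}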

Related

Socket.io - different maxHttpBufferSize values depending on the nature of the request

I am creating an application that allows users to submit JSON or Base64 image data via socket.io.
The goal I am trying to achieve is:
if JSON is submitted, the message can have a maximum size of 1MB
if a Base64 image is submitted, the message can have a maximum size of 5MB
From the socket.io docs I can see that:
you can specify a maxHttpBufferSize option value that allows you to limit the maximum message size
namespaces allow you to split logic over a single connection
However, I can't figure out the correct way to get the functionality to work the way I have described above.
Would I need to:
set up 2 separate io instances on the server, one for JSON data and the other for Base64 images (therefore allowing me to set separate maxHttpBufferSize values for each), and then the client can use the correct instance, depending on what they want to submit (if so, what is the correct way of doing this?)
set up 1 instance with a maxHttpBufferSize of 5MB, and then add in my own custom logic to determine message sizes and prevent further actions if the data is JSON and over 1MB in size
set this up in some totally different way that I haven't thought of
Many thanks
From what I can see in the API, maxHttpBufferSize is a parameter for the underlying Engine.IO server (of which there is one instance per Socket.IO Server Instance). Obviously you're free to set up two servers but I doubt it makes sense to separate the system into two entirely different applications.
Talk of using Namespaces to separate logic is more about handling different messages at different endpoints (for example you would register a removeUserFromChat message handler to a user connecting via an /admin namespace, but you wouldn't want to register this to a user connecting via the /user namespace).
In the most recent socket server I set up, I defined my own protocol where part of the response would contain a HTTP status code, as well as a description that could be displayed to the user. For example I would return 200 on success. If I was uploading a file via a REST HTTP Interface, I would expect a 400 (BAD REQUEST) response if my request couldn't be processed - and I believe that this makes sense for your use case. Alternatively you could define your own custom 4XX error code if the file is too large, and handle this in your UI purely based on the code returned. Obviously you don't need to follow the HTTP protocol, and the design decisions are ultimately up to you, but in my opinion it makes sense to return some kind of error response in your message handler.
I suspect that maxHttpBufferSize operates at a lower level than your use case. When content is sent over the network it is split into packets of n bytes, and when an application writes n bytes, the network sends a packet (the smaller n is, the more overhead from network headers; the larger n is, the higher the latency from waiting to accumulate n bytes before sending). The documentation is not clear about maxHttpBufferSize, but it could be the packet-size (n) configuration rather than a limit on the maximum data per connection.
It seems the HTTP request header Content-Length might serve your purpose: it gives the actual object size, and based on that you can make a decision.
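For illustration, here is how that check might look in Go (the language used elsewhere on this page; the limits and path are examples only): reject early based on Content-Length, then cap the actual read with http.MaxBytesReader in case the header lies.

package main

import (
    "io"
    "net/http"
)

const maxJSON = 1 << 20  // 1 MB, example limit for JSON payloads
const maxImage = 5 << 20 // 5 MB, example limit for Base64 images

func uploadJSON(w http.ResponseWriter, r *http.Request) {
    // Fast path: trust Content-Length when the client supplies it.
    if r.ContentLength > maxJSON {
        http.Error(w, "payload too large", http.StatusRequestEntityTooLarge)
        return
    }
    // Hard cap: MaxBytesReader fails the read if the body exceeds the
    // limit, covering clients that lie about (or omit) Content-Length.
    body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, maxJSON))
    if err != nil {
        http.Error(w, "payload too large", http.StatusRequestEntityTooLarge)
        return
    }
    _ = body // process the JSON here
}

func main() {
    http.HandleFunc("/upload/json", uploadJSON)
    http.ListenAndServe(":8081", nil)
}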

JMeter encoding issue on "application/soap+msbin1"

Working in JMeter, trying to send a SOAP request to the server, which responds with the error message below.
Error message: Cannot process the message because the content type 'application/soap+msbin1' was not the expected type 'application/xml; charset=utf-8'.
We need help encoding the XML to the 'application/soap+msbin1' format.
Bit late to the party, but I encountered a similar issue - I had a template for SOAP request which uses embedded-binary XML (xop:Include cid="...") and had to scratch my head to figure out how to do that with the stock HTTP Request.
The answer: you can't - not in a simple way. To solve the issue, I ended up customizing JMeter (I also looked at HTTPRawRequest as well but it doesn't seem to support https and I would have to rewrite a lot of the test script to use that). Since HTTP request does 99% of the job, the quickest way to support binary data is to change the source code to handle binary data.
There are two main issues. First, the Function interface in JMeter is designed around returning String, not byte[], so __FileToString() (which I used to read an external binary file) encodes the file's content as a String. Second, the HTTP Request sampler and HTTPHC4Impl itself (excluding the "upload file" bit) encode the parts of the HTTP request before sending them over the wire.
Changing that implied changes to Function, AbstractFunction, and CompoundVariable, plus a new function class FileToStringBinary which encodes the binary data in a way that it can be decoded afterwards (by changes made to HTTPHC4Impl).
If I have the time I'll find someplace to post the idea and the source (I can't submit it to JMeter because my update to HTTPHC4Impl is limited to the specific requests I need to test, where the embedded binary is in a multipart/related part, and I have no time or inclination to handle the general case), but if you still need help making it work, drop a line.

REST API design and workflow to upload images

I want to design an API that allows clients to upload images; the application then creates different variants of each image (resizing or changing the format) and stores the information for each variant in a database. The problem comes when I try to determine the proper strategy to implement this task. Here are some strategies I can think of:
Strategy 1:
Send a POST request to /api/pictures/,
create all the image variants, and return 201 Created if all image files were created correctly and the image information was saved to the database; otherwise return a 500 error.
pros: easy to implement
cons: the client has to wait a very long time until all variants of the images are created.
Strategy 2:
Send a POST request to /api/pictures/, create just the necessary information for the image variants, store it in the database, and return 202 Accepted; then start creating the actual image variant files. The 202 response includes a Location header with a new URL, something like /api/pictures/:pictureId/status, to 'monitor' the state of the creation process. The client can use this URL to check whether the process has completed: if it has, return 201 Created; if it is still pending, return 200 OK; if there is an error during the process, it ends and returns 410 Gone.
pros: the client gets a very fast response, and it doesn't have to wait until all image variants are created.
cons: the server-side logic is hard to implement, and the client has to keep checking the returned location URL in order to know when the process has finished.
Another problem: if, say, all the image variants are created correctly but one fails, the entire process returns 410 Gone; the client can keep sending requests to the status URL, because the application will try to create the failed image again, returning 201 once it ends correctly.
Strategy 3:
This is very similar to strategy 2, but instead of returning a location for the whole 'process', it returns an array of locations with status URLs for each image variant. This way the client can check the status of each individual variant instead of the status of the whole process.
pros: same as strategy 2, plus if one image variant fails during creation, the other variants are not affected. For example, if one of the variants fails during creation it returns 410 Gone, while the images that were created properly return 201 Created.
cons: the client is harder to implement because it has to keep track of an array of locations instead of just one, and the number of requests increases in proportion to the number of variants.
My question is what is the best way to accomplish this task?
Your real problem is how to deal with asynchronous requests in HTTP. My approach to that problem is usually to adopt option 2, returning 202 Accepted and allowing the client to check current status with GET on the Location URI if he wants to.
Optionally, the client can provide a callback URI on a request header, which I will use to notify completion.
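A sketch of that flow in Go (the paths mirror the question; createVariants is hypothetical, and the route patterns assume Go 1.22+): POST returns 202 Accepted with a Location header pointing at a status resource the client can poll.

package main

import (
    "fmt"
    "net/http"
    "sync"
)

var (
    mu     sync.Mutex
    status = map[string]string{} // pictureId -> "pending" | "done" | "failed"
    nextID int
)

// createVariants is a hypothetical stand-in for the real image work.
func createVariants(id string) error {
    // ... resize, change formats, store rows in the database ...
    return nil
}

// POST /api/pictures/: record the upload, start variant creation in the
// background, and return 202 with a Location header to poll.
func createPicture(w http.ResponseWriter, r *http.Request) {
    mu.Lock()
    nextID++
    id := fmt.Sprint(nextID)
    status[id] = "pending"
    mu.Unlock()
    go func() {
        err := createVariants(id)
        mu.Lock()
        defer mu.Unlock()
        if err != nil {
            status[id] = "failed"
        } else {
            status[id] = "done"
        }
    }()
    w.Header().Set("Location", "/api/pictures/"+id+"/status")
    w.WriteHeader(http.StatusAccepted)
}

// GET /api/pictures/{id}/status: polled by the client until done.
func pictureStatus(w http.ResponseWriter, r *http.Request) {
    mu.Lock()
    s, ok := status[r.PathValue("id")]
    mu.Unlock()
    if !ok {
        http.NotFound(w, r)
        return
    }
    fmt.Fprintf(w, `{"status":%q}`, s)
}

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("POST /api/pictures/", createPicture)
    mux.HandleFunc("GET /api/pictures/{id}/status", pictureStatus)
    http.ListenAndServe(":8081", mux)
}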

Ruby Sockets and parallel event handling

I'm writing a library that can interact with a socket server that transmits data as events to certain actions my library sends it.
I created an Actions module that formats the actions so that the server can read them. It also generates an action_id, so that the events parser can match each event with the action that sent it (more than one event per action is possible).
While I'm sending my action to the server, the event parser keeps receiving data from the server, so they work independently of each other (but they do work together: the events response aggregator triggers the action callback).
In my model, I want to get a list of some resource from the server. The server sends its data one line at a time, but that's being handled by the events aggregator, so don't worry about that.
Okay, my problem:
In my model I request the resources, but since the events are parsed in another thread, I have to run an "infinite" loop that checks whether the list is filled, and then break out of it to return the list to the consumer of the model (e.g. my controller).
Is there another (better) way of doing this or am I on the right track? I would love your thoughts :)
Here is my story in code: https://gist.github.com/anonymous/8652934
Check out Ruby EventMachine.
It's designed to simplify this sort of reactor pattern application.
It depends on the implementation; the code you provided doesn't show how the requests and responses are actually processed.
If you know exactly how many responses you're supposed to receive, then in each one you could check whether all have been completed, and then execute a specific action, e.g.:
# suppose response_receiver is the method which receives the server response
def response_receiver(data)
  @responses_list << data
  # once every expected response has arrived, trigger the action
  if @responses_list.size == @expected_size
    # Execute some action
  end
end
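The same idea transposed to Go for comparison (all names hypothetical): instead of spinning in a check loop, the consumer blocks on a channel that the aggregator signals once the list is complete, with a timeout as a safety net.

package main

import (
    "fmt"
    "time"
)

// collectResources plays the role of the events aggregator: it gathers
// one item at a time and signals completion over a channel.
func collectResources(lines <-chan string, expected int, done chan<- []string) {
    var list []string
    for line := range lines {
        list = append(list, line)
        if len(list) == expected {
            done <- list // signal the consumer; no polling needed
            return
        }
    }
}

func main() {
    lines := make(chan string)
    done := make(chan []string, 1)
    go collectResources(lines, 3, done)

    // Simulate the server sending its data one line at a time.
    go func() {
        for _, l := range []string{"res1", "res2", "res3"} {
            lines <- l
        }
    }()

    // Block until the list is complete, or give up after a timeout,
    // instead of spinning in an "infinite" check loop.
    select {
    case list := <-done:
        fmt.Println("got:", list)
    case <-time.After(5 * time.Second):
        fmt.Println("timed out waiting for resources")
    }
}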

Large number of concurrent ajax calls and ways to deal with it

I have a web page which, upon loading, needs to do a lot of JSON fetches from the server to populate various things dynamically. In particular, it updates parts of a large-ish data structure from which I derive a graphical representation of the data.
It works great in Chrome; however, Safari and Firefox appear to suffer somewhat. While the numerous JSON requests are being serviced, those browsers become sluggish and unusable. My assumption is that this is due to the rather expensive iteration over said data structure. Is this a valid assumption?
How can I mitigate this without changing the query language so that it's a single fetch?
I was thinking of applying a queue that could limit the number of concurrent Ajax queries (and hence also limit the number of concurrent updates to the data structure)... Any thoughts? Useful pointers? Other suggestions?
In browser-side JS, create a wrapper around jQuery.post() (or whichever method you are using)
that appends the requests to a queue.
Also create a function 'queue_send' that will actually call jQuery.post() passing the entire queue structure.
On the server, create a proxy function called 'queue_receive' that replays the JSON to your server interfaces as though it came from the browser, collects the results into a single response, and sends it back to the browser.
Browser-side queue_send_success() (success handler for queue_send) must decode this response and populate your data structure.
With this, you should be able to reduce your initialization traffic to one actual request, and maybe consolidate some other requests on your website as well.
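A server-side sketch of that queue_receive proxy in Go (the sub-request shape is hypothetical; a real one would mirror whatever your interfaces expect): it accepts an array of sub-requests, replays each against the normal handlers, and returns the collected results as one response.

package main

import (
    "bytes"
    "encoding/json"
    "net/http"
    "net/http/httptest"
)

// SubRequest is one queued browser call: a path plus its original JSON body.
type SubRequest struct {
    Path string          `json:"path"`
    Body json.RawMessage `json:"body"`
}

// queueReceive replays each sub-request against the normal handlers and
// collects the responses into a single JSON array (it assumes every
// endpoint involved returns valid JSON).
func queueReceive(inner *http.ServeMux) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        var batch []SubRequest
        if err := json.NewDecoder(r.Body).Decode(&batch); err != nil {
            http.Error(w, err.Error(), http.StatusBadRequest)
            return
        }
        results := make([]json.RawMessage, 0, len(batch))
        for _, sub := range batch {
            // Replay internally as though it came from the browser.
            req := httptest.NewRequest("POST", sub.Path, bytes.NewReader(sub.Body))
            rec := httptest.NewRecorder()
            inner.ServeHTTP(rec, req)
            results = append(results, json.RawMessage(rec.Body.Bytes()))
        }
        w.Header().Set("Content-Type", "application/json")
        json.NewEncoder(w).Encode(results)
    }
}

func main() {
    inner := http.NewServeMux()
    // ... register the normal JSON endpoints on inner here ...
    outer := http.NewServeMux()
    outer.Handle("/", inner)
    outer.HandleFunc("/queue_receive", queueReceive(inner))
    http.ListenAndServe(":8081", outer)
}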
"In particular, it updates parts of a largish data structure from which I derive a graphical representation of the data."
I'd try:
Queuing responses as they come in, then updating the structure once
Keeping the representation hidden until all the responses are in
Magicianeer's answer is also good - but I'm not sure if it fits your definition of "without changing the query language so that it's a single fetch" - it would avoid re-engineering existing logic.
