Force concurrent chunks to go in order - fine-uploader

We've noticed that with concurrent chunking enabled, the first chunk isn't always sent first. Sometimes, the first chunk is sent last. How do we force the chunks to be sent in order while also being sent concurrently? At the very least, how do we force the first chunk to be sent first?

Your server should not necessarily care about the order of the chunks. Fine Uploader sends a POST when all chunks have completed, so your server knows when to combine them. The library does in fact send the requests in order, but it's possible that they are not arriving at your server in the order they were sent. This is not something Fine Uploader has any control over, and depending on the order of requests to arrive or complete in this case is a recipe for failure.
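As a rough illustration of an order-independent server, the sketch below stores each chunk under its index and only joins them once every part is present. The class name, method names and the in-memory dictionary are assumptions made for this example; they are not part of Fine Uploader, and a real server would usually write chunks to disk.

```python
class ChunkAssembler:
    """Collects chunks keyed by index, so arrival order does not matter."""

    def __init__(self, total_parts: int) -> None:
        self.total_parts = total_parts
        self.parts: dict[int, bytes] = {}

    def add(self, index: int, data: bytes) -> None:
        # Store by index; a re-sent chunk simply overwrites the earlier copy.
        self.parts[index] = data

    def is_complete(self) -> bool:
        return len(self.parts) == self.total_parts

    def combine(self) -> bytes:
        # Intended to run when the "all chunks are done" request arrives.
        missing = [i for i in range(self.total_parts) if i not in self.parts]
        if missing:
            raise ValueError(f"missing chunks: {missing}")
        return b"".join(self.parts[i] for i in range(self.total_parts))


# Chunks arrive out of order, the combined result is still in order.
assembler = ChunkAssembler(total_parts=3)
assembler.add(2, b"c")
assembler.add(0, b"a")
assembler.add(1, b"b")
combined = assembler.combine()
```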


How to handle unsent data in microservices

I have two services A and B. A receives a request, does some processing and sends the processed data to B.
What should I do with the data in the following scenario:
1. A receives data.
2. A processes it successfully.
3. A crashes before sending the data to B.
4. A comes back online.
I would either use some sort of persistent log to handle the communication between the micro-services (e.g. Kafka) or some sort of retry mechanism.
In either case, the data that A received and processed must not disappear until the entire chain of execution completes successfully or, at the very least, until A has successfully completed its work and passed its payload to the next service. And this payload must exist until the next service processes it, and so on.
Generally, the steps should continue as follows:
A comes back online and sees that there is work to be done: the data it processed at step #2 (since its processing is not yet done as far as the overall system is concerned). Unless there are some weird side effects, it shouldn't matter that A processes it again.
The data is sent to B (although this step should, conceptually, be part of "processing" the data).
If A crashes again, it probably means that the data triggers a bug in A, and the cycle of starting up, reprocessing and crashing will continue forever. This is a denial of service, malicious or not, and you should have some procedure in place to handle it: for example, don't reprocess the same data more than a given number of times, and log the failure so it can be analyzed with top priority.
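The recovery loop with a retry cap could look roughly like the following. This is only a sketch: `WorkLog` is an in-memory stand-in for whatever durable log or queue is actually used (Kafka, a database table, etc.), and `MAX_ATTEMPTS`, `recover` and `process_and_send` are hypothetical names chosen for the example.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # hypothetical limit before an item is set aside


class WorkLog:
    """In-memory stand-in for a durable log or queue (assumption)."""

    def __init__(self) -> None:
        self.pending: dict[str, bytes] = {}
        self.attempts: dict[str, int] = {}
        self.dead: dict[str, bytes] = {}  # items that need human attention

    def record(self, item_id: str, payload: bytes) -> None:
        # Called when A receives data, before any processing starts.
        self.pending[item_id] = payload
        self.attempts.setdefault(item_id, 0)

    def ack(self, item_id: str) -> None:
        # Called only after the payload has been handed to B.
        self.pending.pop(item_id, None)


def recover(log: WorkLog, process_and_send: Callable[[bytes], None]) -> None:
    """Run by A at startup: redo everything that was never acknowledged."""
    for item_id in list(log.pending):
        if log.attempts[item_id] >= MAX_ATTEMPTS:
            # Probably a poison message: stop retrying and set it aside.
            log.dead[item_id] = log.pending.pop(item_id)
            continue
        # Count the attempt before doing the work, so a crash still counts.
        log.attempts[item_id] += 1
        process_and_send(log.pending[item_id])
        log.ack(item_id)
```

Because an item can be processed more than once, this only works if processing and sending are safe to repeat, as noted above.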

Queue handling in HTTP server with heavy load

Consider a web server under very heavy load, resulting in significant lag (up to ~30s response time). I've noticed that if I asynchronously request the same page multiple times (i.e. send multiple requests before the previous ones are answered), responses don't necessarily come back in the order I sent them.
Can anyone explain how the server chooses which requests to handle first? It seems there is no obvious queueing, so what makes a request get picked instead of another?

Batching generation of http responses

I'm trying to find an architecture for the following scenario. I'm building a REST service that performs some computation that can be quickly batch computed. Let's say that computing 1 "item" takes 50ms, and computing 100 "items" takes 60ms.
However, the nature of the client is that only 1 item needs to be processed at a time. So if I have 100 simultaneous clients, and I write the typical request handler that sends one item and generates a response, I'll end up using 5000ms, but I know I could compute the same in 60ms.
I'm trying to find an architecture that works well in this scenario. I.e., I would like to have something that merges data from many independent requests, processes that batch, and generates the equivalent responses for each individual client.
If you're curious, the service in question is python+django+DRF based, but I'm curious about what kind of architectural solutions/patterns apply here and if anything solving this is already available.
At first you could think of a reverse proxy that detects all pattern-specific queries, collects these queries and sends them to your application in an HTTP 1.1 pipeline (pipelining is a way to send a large number of queries one after another and receive all HTTP responses in the same order at the end, without waiting for a response after each query).
Pipelining is very hard to do well
you would have to code the reverse proxy yourself, as I do not know of an existing way to do it
one slow response in the pipeline blocks all the other responses
you need an HTTP server able to give several queries to your application at once, something which never happens if the HTTP server is not directly coded in your application language, because HTTP handling is usually built to work on only one query at a time (for example, you never receive 2 queries in a PHP environment: you receive the first one, send the response, and then receive the next one, even if the connection contains 2 queries).
So the better idea would be to do that on the application side. You could identify matching queries, and wait for a small amount of time (10ms?) to see if some other queries are also incoming. You will need a way to communicate between several parallel workers here (say you have 50 application workers and 10 of them have received queries that could be treated in the same batch). This communication channel could be a database (a very fast one) or some shared memory, depending on the technology used.
Then, when too much time has been spent waiting (10ms?) or when a large number of queries has been received, one of the workers could collect all queries, run the batch, and tell all the other workers that a result is there (here again you need a central point of communication, like LISTEN/NOTIFY in PostgreSQL, shared memory, a message queue service, etc.).
Finally every worker is responsible for sending the right HTTP response.
The key here is having a system where the time you lose in coordinating the shared treatment of requests is smaller than the time saved by batching several queries together. In case of low traffic this lost time should stay reasonable (as here you will always lose time waiting for nothing). And of course you are also adding some complexity to the system, making it harder to maintain, etc.
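To make the "wait a little, then run one batch" idea concrete, here is a minimal single-process sketch using `asyncio`. It deliberately swaps the multi-worker coordination described above (shared memory, LISTEN/NOTIFY, a message queue) for in-process futures, so it only applies when all requests are handled by one event loop. `MicroBatcher`, `batch_fn`, `max_wait` and `max_size` are hypothetical names for this example.

```python
import asyncio
from typing import Any, Callable, List, Tuple


class MicroBatcher:
    """Groups individual submissions into one call to batch_fn."""

    def __init__(self, batch_fn: Callable[[List[Any]], List[Any]],
                 max_wait: float = 0.01, max_size: int = 100) -> None:
        self.batch_fn = batch_fn
        self.max_wait = max_wait    # seconds to wait for more items
        self.max_size = max_size    # flush early once this many are queued
        self.pending: List[Tuple[Any, asyncio.Future]] = []
        self.timer = None

    async def submit(self, item: Any) -> Any:
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        self.pending.append((item, fut))
        if len(self.pending) >= self.max_size:
            self._flush()
        elif self.timer is None:
            # First item of a new batch starts the waiting window.
            self.timer = loop.call_later(self.max_wait, self._flush)
        return await fut

    def _flush(self) -> None:
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None
        batch, self.pending = self.pending, []
        if not batch:
            return
        try:
            # Synchronous call: it blocks the loop while the batch runs.
            results = self.batch_fn([item for item, _ in batch])
        except Exception as exc:
            for _, fut in batch:
                if not fut.done():
                    fut.set_exception(exc)
            return
        # Hand each caller the result that matches its own item.
        for (_, fut), result in zip(batch, results):
            if not fut.done():
                fut.set_result(result)
```

Each request handler would call `await batcher.submit(item)` and build its own HTTP response from the returned value. A single isolated request still pays the full `max_wait` delay, which is the low-traffic cost mentioned above.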

How to Reduce 'Waiting Time' and 'Receiving Time' on Page Load

I am using CloudFront, and many times I see that the Waiting time and Receiving time are too high.
According to the Firebug documentation, Waiting time and Receiving time mean:
Waiting - Waiting for a response from the server
Receiving - Time required to read the entire response from the server (and/or time required to read from cache)
I do not understand why it takes so much time and what I can do to reduce the time?
There are multiple things you can do.
Set appropriate headers (Expires, Cache-Control, ETag, etc.)
Use gzipped versions of the assets
Use sprites where possible. Merge your CSS files into one, merge your JS files into one
Run your site through and go through all the recommendations.
Run your site through YSlow and go through all the recommendations
Waiting means that the browser is waiting for the server to process the request and return the response.
When that time is long, it normally means your server-side script takes long to process the request.
There are many reasons why a server-side script is slow, e.g. a long-running database query, processing of a huge file, deep recursions, etc.
To fix that, you need to optimize your script. Besides optimizing the code itself, a simple way to reduce the execution time for subsequent requests is to implement some kind of server-side caching.
Receiving means the browser is receiving the response from the server.
When that time is long, it either means your network connection is slow or the received data is (too) big.
To reduce this time, you therefore need to improve the network connection and/or to reduce the size of the response.
Reducing the response size can be done by compressing the transferred data e.g. by enabling gzip and/or removing unnecessary characters like spaces from the output before outputting the data. You may also choose a different format for the returned data, where possible, e.g. use JSON instead of XML for data or directly returning HTML.
To generally reduce the waiting and receiving times you may implement some client-side caching, e.g. by setting appropriate HTTP headers like Expires, Cache-Control, etc. Then the browser will only make rather small requests to check whether there are new versions of the data to fetch.
You can also avoid the requests completely by saving the data on the client side (e.g. by putting it into the local or session storage) instead of fetching it from the server every time you need it.
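As a small illustration of the compression and caching-header advice, the sketch below gzips a response body and builds matching headers. `prepare_response` and the `max_age` default are assumptions for this example, not part of any framework; in practice the web server or CDN usually handles this, and a response should only be compressed when the client's Accept-Encoding header allows gzip.

```python
import gzip
from typing import Dict, Tuple


def prepare_response(body: bytes, max_age: int = 3600) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, compressed_body) for a cacheable, gzipped response."""
    compressed = gzip.compress(body)
    headers = {
        "Content-Encoding": "gzip",
        "Content-Length": str(len(compressed)),
        # Lets the browser reuse its cached copy for max_age seconds.
        "Cache-Control": f"public, max-age={max_age}",
    }
    return headers, compressed


# Repetitive markup compresses well, which shortens the receiving time.
page = b"<li>item</li>\n" * 1000
headers, payload = prepare_response(page)
```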

How do you know when all the data has been received by the Winsock control that has issued a POST or GET to a web server?

I'm using the VB6 Winsock control. When I do a POST to a server I get back the response as multiple DataArrival events.
How do you know when all the data has arrived?
(I'm guessing it's when the Winsock_Close event fires)
I have used VB6 Winsock controls in the past, and what I did was format my messages in a certain way to know when all the data has arrived.
Example: Each message starts with a "[" and ends with a "]".
"[Message Text]"
When data comes in from the DataArrival event check for the end of the message "]". If it is there you received at least one whole message, and possibly the start of a new one. If more of the message is waiting, store your message data in a form level variable and append to it when the DataArrival event fires the next time.
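The buffering logic described above is language independent; here is a minimal sketch of it in Python rather than VB6. `BracketFramer` and `feed` are hypothetical names, and `feed` plays the role of the DataArrival handler. It assumes the message text itself never contains "[" or "]".

```python
from typing import List


class BracketFramer:
    """Accumulates incoming text and extracts complete "[...]" messages."""

    def __init__(self) -> None:
        self.buffer = ""  # plays the role of the form-level variable

    def feed(self, data: str) -> List[str]:
        # Call this each time new data arrives.
        self.buffer += data
        messages = []
        while True:
            end = self.buffer.find("]")
            if end == -1:
                break  # no complete message yet, keep what we have
            frame, self.buffer = self.buffer[:end + 1], self.buffer[end + 1:]
            start = frame.find("[")
            if start != -1:
                messages.append(frame[start + 1:-1])
        return messages


framer = BracketFramer()
first = framer.feed("[Hel")      # incomplete, nothing returned yet
second = framer.feed("lo][Wor")  # completes one message, starts another
third = framer.feed("ld]")
```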
In HTTP, you have to parse and analyze the reply data that the server is sending back to you in order to know how to read it all.
First, the server sends back a list of CRLF-delimited header lines, which are terminated by a blank CRLF-delimited line by itself. You then have to look at the actual values of the 'Content-Length' and 'Transfer-Encoding' headers to know how to read the remaining data.
If there is no 'Transfer-Encoding' header, or if it does not contain a 'chunked' item in it, then the 'Content-Length' header specifies how many remaining bytes to read. But if the 'Transfer-Encoding' header contains a 'chunked' item, then you have to read and parse the remaining data in chunks, one at a time, in order to know when the data ends (each chunk reports its own size, and the last chunk reports a size of 0).
And no, you cannot rely on the connection being closed after the reply has been sent, unless the 'Connection' header explicitly says 'close'. For HTTP 1.1, that header is usually set to 'keep-alive' instead, which means the socket is left open so the client can send more requests on the same socket.
Read RFC 2616 for more details.
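The header-driven decision described above can be sketched as follows, again in Python rather than VB6. `extract_body` is a hypothetical helper that is given everything received so far and returns the body once it is complete, or `None` if more data is needed. It is a simplified sketch: chunk trailers are ignored, and a response with neither header is treated as incomplete because its end is only signalled by the server closing the connection.

```python
from typing import Dict, Optional


def parse_headers(head: bytes) -> Dict[str, str]:
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("latin-1")] = value.strip().decode("latin-1")
    return headers


def decode_chunked(data: bytes) -> Optional[bytes]:
    body, pos = b"", 0
    while True:
        line_end = data.find(b"\r\n", pos)
        if line_end == -1:
            return None  # chunk-size line not fully received yet
        # The size is hexadecimal; anything after ";" is a chunk extension.
        size = int(data[pos:line_end].split(b";")[0], 16)
        start = line_end + 2
        if size == 0:
            return body  # last chunk reached (trailers ignored here)
        if len(data) < start + size + 2:
            return None  # chunk data or its trailing CRLF still missing
        body += data[start:start + size]
        pos = start + size + 2


def extract_body(raw: bytes) -> Optional[bytes]:
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        return None  # blank line ending the headers has not arrived
    headers = parse_headers(head)
    if "chunked" in headers.get("transfer-encoding", "").lower():
        return decode_chunked(rest)
    if "content-length" in headers:
        length = int(headers["content-length"])
        return rest[:length] if len(rest) >= length else None
    return None  # neither header: wait for the connection to close
```

The caller would append each newly arrived block of data to a buffer and call `extract_body` on the whole buffer until it returns something other than `None`.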
No, the Close event doesn't fire when all the data has arrived; it fires when the remote side closes the connection. It's not the Winsock control's job to know when all the data has been transmitted, it's yours. As part of your client/server communication protocol implementation, you have to tell the client what to expect.
Suppose your client wants the contents of a file from the server. The client doesn't know how much data is in the file. The exchange might go something like this:
The client sends a request for the data in the file.
The server reads the file, determines its size, prepends the size to the data (let's say it uses 4 bytes) so the client knows how much data to expect, and starts sending.
Your client code knows to strip the first 4 bytes off any data that arrives after a file request and store them as the amount of data to follow, then accumulates the subsequent data, through any number of DataArrival events, until it has that amount.
Ideally, the server would append a checksum to the data as well, and you'll have to implement some sort of timeout mechanism, figure out what to do if you don't get the expected amount of data, etc.
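The size-prefix exchange above might be sketched like this, in Python rather than VB6. The 4-byte big-endian unsigned length, and the names `frame` and `LengthPrefixedReader`, are assumptions made for this example; the checksum and timeout handling mentioned above are left out.

```python
import struct
from typing import Optional


def frame(payload: bytes) -> bytes:
    """Server side: prepend a 4-byte big-endian length to the payload."""
    return struct.pack("!I", len(payload)) + payload


class LengthPrefixedReader:
    """Client side: accumulate data until the announced amount has arrived."""

    def __init__(self) -> None:
        self.buffer = b""

    def feed(self, data: bytes) -> Optional[bytes]:
        # Call this each time new data arrives.
        self.buffer += data
        if len(self.buffer) < 4:
            return None  # length prefix itself is not complete yet
        (length,) = struct.unpack("!I", self.buffer[:4])
        if len(self.buffer) < 4 + length:
            return None  # still waiting for the rest of the payload
        payload = self.buffer[4:4 + length]
        self.buffer = self.buffer[4 + length:]  # keep any following bytes
        return payload


wire = frame(b"file contents")
reader = LengthPrefixedReader()
step1 = reader.feed(wire[:2])    # half of the length prefix
step2 = reader.feed(wire[2:7])   # rest of the prefix plus some data
step3 = reader.feed(wire[7:])    # remainder of the payload
```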
