How much data is transferred during the first HTTP request - performance

I would like to have a lightning-fast website focused on mobile. Therefore I would like to inline as many graphics, styles and scripts as possible and use only one or two fast HTTP requests to display the first part of the page.
My question is: how much can I inline, and how big may my document get before it has to be split across several round trips?
As far as I know, HTTP uses TCP to send its IP packets, and TCP has a window that limits how far apart the last sent and the highest acknowledged packet may be, and it scales this window.
But how much payload can be transported before the server has to wait for an ACK from my client in the worst case (first window sent, no ACKs received yet)? And what does it depend on: the browser, the OS, the device?

But how much payload can be transported before the server has to wait for an ACK from my client in the worst case (first window sent, no ACKs received yet)? And what does it depend on: the browser, the OS, the device?
It depends on the size of the socket receive buffer in the receiver.
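For a rough idea of the numbers involved, here is a hedged sketch of inspecting and enlarging that receive buffer with BSD-style socket calls (it assumes a POSIX system and an already connected TCP socket sock; the 256 kB figure is purely illustrative, not a recommendation):
// Sketch: inspect and enlarge the socket receive buffer. The receive buffer
// determines the window the client advertises, i.e. how much un-ACKed data
// the server may have in flight before it must wait for the client's ACKs.
#include <sys/socket.h>
#include <cstdio>

void show_and_grow_rcvbuf(int sock)
{
    int rcvbuf = 0;
    socklen_t len = sizeof(rcvbuf);
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &len) == 0)
        std::printf("current SO_RCVBUF: %d bytes\n", rcvbuf);

    int wanted = 256 * 1024;  // illustrative value only
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &wanted, sizeof(wanted)) != 0)
        std::perror("setsockopt(SO_RCVBUF)");
}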

Related

Purpose of zeromq send high watermark

The first time I skimmed the zeromq docs, I assumed that the sender high watermark was there to ensure that the sender did not get too far ahead of the receiver. Now that I'm looking at it more carefully, it seems that this can't possibly be true, since the wire protocol doesn't have any concept of ACKs, so the sender can't know whether the receiver is keeping up or is way behind. After staring at the jeromq code in the debugger for far too long, it seems that the watermark is actually a purely within-the-same-process mechanism: it ensures that the application thread writing to the ZMQ socket does not get too far ahead of the background thread responsible for taking messages off the ZMQ socket and writing bytes into the OS's TCP socket.
It seems like a rather fringe thing to worry about, relative to how much attention it gets in the docs. It doesn't even seem like a great way to control memory usage, because with a high watermark of 10, fifteen messages of 2 kB each are not allowed, but five messages of 100 MB each are, so things are still pretty unpredictable.
Am I understanding all this correctly, or am I hopelessly confused?
I think another sign that it's not there to prevent a sender getting too far ahead of the receiver is that if one sets the HWM to 0, that's taken as infinity, not actually zero. For 0 to mean zero, there would have to be some to-ing and fro-ing with the receiver to know whether the socket was actually empty throughout the whole connection.
I wish that 0 did mean zero, because then ZeroMQ could implement both Actor Model and Communicating Sequential Processes architectures. But it doesn't, so it can't.
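For reference, a minimal sketch of setting the high watermark via the libzmq C API (the question above concerns JeroMQ, but the discussion is about the same option); note that the value counts messages, not bytes, and 0 is documented as "no limit" rather than "queue nothing":
// Sketch: cap the number of outbound messages queued inside ZeroMQ's I/O
// machinery for this socket. ZMQ_SNDHWM counts messages, and 0 means
// "no limit", not "zero queueing".
#include <zmq.h>
#include <cassert>

int main()
{
    void *ctx  = zmq_ctx_new();
    void *push = zmq_socket(ctx, ZMQ_PUSH);

    int hwm = 10;  // messages, regardless of their individual sizes
    int rc = zmq_setsockopt(push, ZMQ_SNDHWM, &hwm, sizeof(hwm));
    assert(rc == 0);

    // ... bind/connect and send as usual ...

    zmq_close(push);
    zmq_ctx_term(ctx);
    return 0;
}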
Possible Uses
Nonetheless, a potentially useful aspect is related to the fact that ZeroMQ follows the Actor Model. Suppose one were sending messages, and it mattered whether or not those messages got through. In a situation where the link has collapsed (something ZeroMQ's heartbeat can tell you pretty quickly), messages already sent are potentially lost forever. However, if the HWM is being used to throttle the rate at which the application sends messages, then the number of messages lost when the link breaks is minimised.
Obviously with CSP - the perfect architecture so far as I'm concerned! - you lose no messages (because the acts of sending and receiving are an execution rendezvous; the send won't complete until the receive has also completed).
What I have done in the past is to queue up messages for transmission in the sending application, sending them as and when the socket / connection can ingest them. Having the outbound message queue under the sending application's control (instead of ZeroMQ's) means that the sender's state can get ahead of the actual transfer of messages, yet still recover easily from a network connection fault.
I have written systems where a sender has a choice of two pathways to send messages through - prime and spare - and if the link to prime has collapsed the sender continues sending to spare instead. Queueing the messages inside the application, and not in the socket, allows the sender's state to get ahead of the actual transfer of messages, in the knowledge that if a link goes down it still has all the unsent outbound messages generated in the meantime. These can then be directed at spare instead, without having to rewind the sender's internal state (which could be really tricky) to the last known successful transfer (a rough sketch of this appears below).
Something like that, anyway.
"Why not send to both prime and spare anyway?" is a valid question. Well, sometimes things can be complicated...

Websocket: binary data messages order [duplicate]

If we send two messages over the same HTML5 WebSocket a split millisecond apart from each other,
Is it theoretically possible for the messages to arrive in a different order than they were sent?
Short answer: No.
Long answer:
WebSocket runs over TCP, so on that level @EJP's answer applies. WebSocket can be "intercepted" by intermediaries (like WS proxies): those are allowed to reorder WebSocket control frames (i.e. WS pings/pongs), but not message frames when no WebSocket extension is in place. If there is a negotiated extension in place that in principle allows reordering, then an intermediary may only do so if it understands the extension and the reordering rules that apply.
It's not possible for them to arrive in your application out of order. Anything can happen on the network, but TCP will only present you the bytes in the order they were sent.
At the network layer TCP is supposed to guarantee that messages arrive in order. At the application layer, errors can occur and cause your messages to be out of order in the logic of your program: it could be the network stack your application is using, or your application code itself.
If you asked me whether my Node.js application can guarantee sending and receiving messages in order, I'm going to have to say no. I've run WebSocket applications connected to WiFi under high latency and low signal; it causes very strange behaviour, as if packets were dropped and messages were out of sequence.
This article is a good read: https://samsaffron.com/archive/2015/12/29/websockets-caution-required

Why does the WebSocket RFC allow control frames to be interleaved with fragmented messages

According to the WebSocket RFC 6455, control frames may be interleaved with the fragments of a fragmented message.
I don't understand the need for this, as it makes the design more complex for both the sending and the receiving side.
Currently, a control frame can be "Close", "Ping" or "Pong" (everything else is reserved).
If the control frame is "Close", then receiving the end of the fragmented message is useless, so no interleaving would be required (the fragmenting side could just send the "Close" opcode and stop sending any more fragments, since you are not supposed to send anything after a "Close").
If the control frame is "Ping" or "Pong", it does not make any sense either. The fragmenting side is already sending data to the client, so why would it ping the client to check whether it's alive (it has that information from the send system call already)? And why reply to a ping immediately, when it is in the middle of sending data to the client anyway?
So why do we need this mechanism of interleaved control frames at all?
It is there to detect half-open connections: http://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html
The other side could be sending you data but be unable to receive yours. By being able to interleave pings and pongs, it is possible to check that at least the other end can understand your messages and reply to them.
It does not make the implementation much more complex. You have to read delimited frames anyway; when you find a control frame, take action and continue reading more frames (a simplified sketch of that dispatch appears after the quoted spec text below).
http://www.whatwg.org/specs/web-apps/current-work/multipage/network.html#ping-and-pong-frames
Ping and Pong frames
The WebSocket protocol specification defines Ping and Pong frames that
can be used for keep-alive, heart-beats, network status probing,
latency instrumentation, and so forth. These are not currently exposed
in the API.
User agents may send ping and unsolicited pong frames as desired, for
example in an attempt to maintain local network NAT mappings, to
detect failed connections, or to display latency metrics to the user.
User agents must not use pings or unsolicited pongs to aid the server;
it is assumed that servers will solicit pongs whenever appropriate for
the server's needs.
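To make the "take action and continue reading" point concrete, here is a simplified sketch of the receive-side dispatch; it works on already-parsed frames (a made-up Frame struct, no masking or payload-length handling) rather than raw RFC 6455 wire bytes:
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

// Simplified view of an already-parsed WebSocket frame.
struct Frame {
    std::uint8_t opcode;   // 0x0 continuation, 0x1 text, 0x8 close, 0x9 ping, 0xA pong
    bool fin;
    std::string payload;
};

// Reassemble a fragmented message while answering control frames that may be
// interleaved between its fragments, as RFC 6455 permits.
void read_loop(const std::vector<Frame>& frames) {
    std::string message;              // fragments collected so far
    for (const Frame& f : frames) {
        switch (f.opcode) {
        case 0x9:                     // ping: reply at once, keep reassembling
            std::cout << "-> pong(" << f.payload << ")\n";
            break;
        case 0xA:                     // pong: note liveness, nothing to reassemble
            break;
        case 0x8:                     // close: stop; any partial message is dropped
            std::cout << "connection closed\n";
            return;
        default:                      // data or continuation frame
            message += f.payload;
            if (f.fin) {
                std::cout << "message: " << message << '\n';
                message.clear();
            }
        }
    }
}

int main() {
    // A fragmented text message with a ping interleaved between its fragments.
    read_loop({{0x1, false, "Hel"}, {0x9, true, "ka"}, {0x0, true, "lo"}});
}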

Windows Socket TCP Client Receives data only every 200ms (QTCPSocket)

I am using QTCPSocket to connect to a TCP server (which is running on Ubuntu). The server sends, at minimum, a 1-byte packet every 40 ms. My application is real-time, so it is important that I receive data as fast as possible, even at the cost of extra network traffic.
Once I have connected a TCP client from Windows, I start receiving packets. However, the readyRead() signal from the QTCPSocket is only emitted once every 200 ms (with 5 bytes in the packet). I have looked at the packets in Wireshark; they really are arriving as 5-byte packets.
However, using QTCPSocket on Mac (the exact same code, in fact), I get individual packets every time: all of my 1-byte packets arrive as single-byte packets, which is great.
I tried creating a raw Windows socket (not using QTCPSocket) and got identical behaviour to QTCPSocket on Windows.
What is the difference that lets the Mac socket receive packets at a much higher time resolution? Is there something I can set in setsockopt() which will prevent this 200 ms buffering from occurring?
I am aware that setting TCP_NODELAY on the server side will probably solve my problem, but seeing as the Mac TCP Client works as intended, there must be a way to get the same behaviour on Windows.
Setting mySocket->setSocketOption(QAbstractSocket::LowDelayOption, 1); on the server side is the only way I have found to remedy this problem.
For others who stumble upon this coming from search engines:
The above (correct) answer by oggmonster can also be described by:
// Disable Nagle's algorithm on the socket so small writes are sent
// immediately instead of being coalesced into larger segments.
int on = 1;
if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, (char*)&on, sizeof(on)))
{
    return -1;
}
You need to acknowledge each byte of data you receive to give the reply ACKs some data to piggyback on. Talk to whoever designed your protocol.
Trying to answer questions like "why does it work on X and not on Y" is only useful when one of the behaviours isn't correct. If the protocol has no application-level acknowledgements, then both behaviours are correct. If one of them shouldn't be correct, then the protocol needs a mechanism to control that, such as application-level acknowledgements. If it doesn't have one, the protocol is broken. Trying to figure out why a broken protocol doesn't work is pointless - it doesn't work because it's broken.
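One way to picture the application-level acknowledgement suggested above: after every message the receiver writes a one-byte ack of its own, so the TCP ACK has a segment to ride on instead of sitting in the delayed-ACK timer. A hedged POSIX-style sketch (assumes a connected socket sock; the 0x06 ack byte and the framing are invented for illustration, and the peer must expect and consume it):
// Sketch: echo a 1-byte application-level ack after each read so the TCP
// ACK piggybacks on real data instead of being delayed.
#include <sys/socket.h>
#include <sys/types.h>

ssize_t read_and_ack(int sock, char *buf, size_t len)
{
    ssize_t n = recv(sock, buf, len, 0);
    if (n > 0) {
        const char ack = 0x06;    // arbitrary app-level ack byte
        send(sock, &ack, 1, 0);   // the peer's protocol must consume this
    }
    return n;
}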

boost::asio sending data faster than receiving over TCP. Or how to disable buffering

I have created a client/server program: the client starts an instance of a Writer class and the server starts an instance of a Reader class. The Writer then writes DATA_SIZE bytes of data asynchronously to the Reader every USLEEP milliseconds. Every successive async_write request by the Writer is issued only once the "on write" handler from the previous request has been called.
The problem is, if the Writer (client) is writing more data into the socket than the Reader (server) is capable of receiving, this seems to be the behaviour: the Writer starts writing into (I think) a system buffer, and even though the data has not yet been received by the Reader, it keeps calling the "on write" handler without an error. When the buffer is full, boost::asio won't fire the "on write" handler anymore, until the buffer gets smaller. In the meantime, the Reader is still receiving small chunks of data. The fact that the Reader keeps receiving bytes after I close the Writer program seems to confirm this theory.
What I need to achieve is to prevent this buffering, because the data needs to be "real time" (as much as possible). I'm guessing I need to use some combination of the socket options that asio offers, like no_delay or send_buffer_size, but I'm just guessing here, as I haven't had success experimenting with these.
I think the first solution one would think of is to use UDP instead of TCP. That will indeed be the case, as I'll need to switch to UDP for other reasons in the near future anyway, but I would first like to find out how to do it with TCP, just for the sake of having it straight in my head in case I run into a similar problem some other day.
NOTE1: Before I started experimenting with asynchronous operations in the asio library, I had implemented this same scenario using threads, locks and asio::sockets, and did not experience such buffering at that time. I had to switch to the asynchronous API because asio does not seem to allow timed interruptions of synchronous calls.
NOTE2: Here is a working example that demonstrates the problem: http://pastie.org/3122025
EDIT: I've done one more test. In NOTE1 I mentioned that when I was using asio::iosockets I did not experience this buffering, so I wanted to be sure and created this test: http://pastie.org/3125452 It turns out that the buffering is there even with asio::iosockets, so there must have been something else that made it go smoothly, possibly a lower FPS.
TCP/IP is definitely geared toward maximizing throughput, as the intention of most network applications is to transfer data between hosts. In such scenarios it is expected that a transfer of N bytes will take T seconds, and clearly it doesn't matter if the receiver is a little slow to process the data. In fact, as you noticed, the TCP/IP protocol implements a sliding window, which allows the sender to buffer some data so that it is always ready to be sent, but leaves the ultimate throttling control up to the receiver. The receiver can go full speed, pace itself or even pause the transmission.
If you don't need throughput and instead want to guarantee that the data your sender is transmitting is as close to real time as possible, then what you need is to make sure the sender doesn't write the next packet until it receives an acknowledgement from the receiver that the previous data packet has been processed. So instead of blindly sending packet after packet until you are blocked, define a message structure for control messages to be sent from the receiver back to the sender.
Obviously with this approach your trade-off is that each sent packet is closer to the sender's real time, but you are limiting how much data you can transfer, while slightly increasing the total bandwidth used by your protocol (i.e. the additional control messages). Also keep in mind that "close to real time" is relative, because you will still face delays in the network as well as limits on the receiver's ability to process data. So you might also take a look at the design constraints of your specific application to determine how "close" you really need to be.
If you need to be very close, but at the same time you don't care if packets are lost because old packet data is superseded by new data, then UDP/IP might be a better alternative. However, (a) if you have reliable delivery requirements, you might end up reinventing a portion of TCP/IP's wheel, (b) keep in mind that certain networks (corporate firewalls) tend to block UDP/IP while allowing TCP/IP traffic, and (c) even UDP/IP won't be exactly real time.
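A condensed synchronous sketch of that send-then-wait idea with Boost.Asio (the one-byte ack and the message framing are invented for illustration; a real protocol would define both, and would likely also want timeouts):
// Sketch: the writer sends one packet, then blocks until the reader returns
// a 1-byte application-level ack, so the sender can never run ahead of what
// the receiver has actually processed.
#include <boost/asio.hpp>
#include <string>
#include <vector>

namespace asio = boost::asio;
using asio::ip::tcp;

void paced_send(tcp::socket &socket, const std::vector<std::string> &packets)
{
    for (const std::string &p : packets) {
        asio::write(socket, asio::buffer(p));       // push the whole packet out

        char ack = 0;
        asio::read(socket, asio::buffer(&ack, 1));  // wait for the app-level ack
        // only now is the next packet written, keeping sender-side latency bounded
    }
}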

Resources