How to utilize all available bandwidth with real-time data? - algorithm

How to measure actual bandwidth between server and client to decide how much of real-time data to send?
My server sends real-time data to clients 30 times per second. If the server has too much data, it prioritises data chunks and throws away anything that doesn't fit into the available bandwidth, because this data will be invalidated next tick anyway. Data is sent over reliable (20%) and unreliable (80%) channels (both UDP based, but if TCP as a reliable channel can provide any benefit please let me know). Data is highly latency-sensitive. The server often (but not always!) has more data than the available bandwidth. It's critical to send as much data as possible, but not more than the available bandwidth, to avoid packet drops or higher latency.
Server and client are custom applications so can implement any algorithm/protocol.
My main problem is how to keep track of available bandwidth. Also any statistical info about typical bandwidth jitter would be helpful (servers are in a cloud, clients are home users, worldwide).
At the moment I'm thinking how to utilize:
latency info of reliable channel. It should correlate with bandwidth because if latency grows this can (!) mean retransmission is involved as a result of packet drops, and so the server must lower its data rate.
data amount received by client on unreliable channel during time frame. Especially if data amount is lower than what was sent from server.
if current latency is close to or below lowest recorded one, bandwidth can be increased
The problem is that this approach is too complicated and involves a lot of "heuristics" like what should be a step to increase/decrease bandwidth etc.
Looking for any advice from people who dealt with similar problem in the past or just any bright ideas
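The prioritise-and-drop step described in the question can be sketched roughly as follows. This is only an illustration: the `Chunk` type, the "higher number means more important" convention, and the per-tick budget calculation are assumptions, not details from the question.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    priority: int   # hypothetical convention: higher means more important
    payload: bytes


def select_for_tick(chunks, budget_bytes):
    """Pick chunks in priority order until the per-tick byte budget is used up.

    Anything that does not fit is simply skipped, since it would be stale
    by the next tick anyway.
    """
    chosen, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c.priority, reverse=True):
        size = len(chunk.payload)
        if used + size <= budget_bytes:
            chosen.append(chunk)
            used += size
    return chosen, used


# Example budget: an estimated 1 Mbit/s spread over 30 ticks per second.
budget = 1_000_000 // 8 // 30   # bytes per tick
```

The byte budget is the only input that depends on the bandwidth estimate, so the estimator can be swapped out without touching this step.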

The first symptom of trying to use more bandwidth than you actually have will be increased latency, as you fill up the buffers between the sender and whatever the bottleneck is. See https://en.wikipedia.org/wiki/Bufferbloat. My guess is that if you can successfully detect increased latency as you start to fill up the bandwidth and back off then you can avoid packet loss.
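A minimal sketch of that "detect rising latency and back off" idea, assuming the client echoes timestamps so the server gets regular RTT samples. The class name, the 1.25x threshold, the additive step and the 0.85 back-off factor are all made-up illustration values that would need tuning; this is not a substitute for a proper congestion control algorithm.

```python
class DelayBasedRateController:
    """Additive increase while latency stays near its floor, multiplicative
    decrease when latency rises above it (all constants are assumptions)."""

    def __init__(self, start_bps, min_bps, max_bps,
                 delay_threshold=1.25, increase_bps=50_000, decrease_factor=0.85):
        self.rate_bps = start_bps
        self.min_bps = min_bps
        self.max_bps = max_bps
        self.delay_threshold = delay_threshold
        self.increase_bps = increase_bps
        self.decrease_factor = decrease_factor
        self.base_rtt = None  # lowest RTT observed so far

    def on_rtt_sample(self, rtt_seconds):
        """Update the send rate from one RTT measurement and return it."""
        if self.base_rtt is None or rtt_seconds < self.base_rtt:
            self.base_rtt = rtt_seconds
        if rtt_seconds > self.base_rtt * self.delay_threshold:
            # Latency is well above the floor: buffers are filling, back off.
            self.rate_bps *= self.decrease_factor
        else:
            # Latency is close to the floor: probe for more bandwidth.
            self.rate_bps += self.increase_bps
        self.rate_bps = max(self.min_bps, min(self.max_bps, self.rate_bps))
        return self.rate_bps
```

In practice the base RTT would probably need to expire after a while, since routes change and a stale minimum would keep the rate pinned down.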
I wouldn't underestimate TCP - people have spent a lot of time tuning its congestion avoidance to get a reasonable amount of the available bandwidth while still being a good network citizen. It may not be easy to do better.
On the other hand, a lot will depend on the attitude of the intermediate nodes, which may treat UDP differently from TCP. You may find that under load they either prioritize or discard UDP. Also some networks, especially with satellite links, may use https://en.wikipedia.org/wiki/TCP_acceleration without you even knowing about it. (This was a painful surprise for us - we relied on the TCP connection failing and keep-alive to detect loss of connectivity. Unfortunately the TCP accelerator in use maintained a connection to us, pretending to be the far end, even when connectivity to the far end had in fact been lost).

After some research, the problem has a name: congestion control, or congestion avoidance. It's quite a complicated topic and there is a lot of material about it. TCP congestion control has evolved over time and is really good. There are other protocols that implement it, e.g. UDT or SCTP.

Related

What is the relationship between request content size and request duration

At the company where I work, all our APIs send and expect requests/responses that follow the JSON:API standard, making the structure of the request/response content very regular.
Because of this regularity and the fact that we can have hundreds or thousands of records in one request, I think it would be fairly doable and worthwhile to start supporting compressed requests (every record would be something like < 50% of the size of its JSON:API counterpart).
To make a well informed judgement about the viability of this actually being worthwhile, I would have to know more about the relationship between request size and duration, but I cannot find any good resources on this. Anybody care to share their expertise/resources?
Bonus 1: If you were to have request performance issues, would you look at compression as a solution first, second, last?
Bonus 2: How does transmission overhead scale with size? (If I cut the size by 50%, by what percentage will the transmission overhead be cut?)
Request and response compression adds a time and CPU penalty on both the sender's side and the receiver's side. The saving in time is in the transmission.
How the tradeoff weighs out depends a lot on the customers of the API: when they make requests, how much they request, what is requested, where they are located, the type of device/OS and its capabilities, etc.
If the data is static -- for eg: a REST query apihost/resource/idxx returning a static resource, there are web standard approaches like caching of static resources that clients / proxies will be able to assist with.
If the data is dynamic -- there are architectural patterns that could be used.
If the data is huge -- eg: big scientific data sets, video etc., almost always you would find them being served statically with a metadata service that provides the dynamic layer. For eg: MPEG-DASH or HLS is just a collection of files.
I would choose compression as a last option relative to the other architectural options.
There are also implementation optimizations that would precede using compression of request/response. For eg:
Are your services using all available resources at disposal (cores, memory, i/o)
Does the architecture allow scale-up and scale-out and can the problem be handled effectively using that (remember the penalties on client side due to compression)
Can you use queueing, caching or other mechanisms to make things appear faster?
If you have explored all these and the answer is your system is optimal and you are looking at the most granular unit of service where data volume is an issue, by all means go after compression. Keep in mind that you need to budget compute resources for compression on the server side as well (for a fixed workload).
Your question#2 on transmission overhead vs size is a question around bandwidth and latency. Bandwidth determines how much you can push through the pipe. Latency governs the perceived response times. Whether the payload is 10 bytes or 10MB, latency for a client across the world encountering multiple hops will be larger relative to a client encountering only one or two hops and is bound by the round-trip time. So, a solution may be to distribute the servers and place them closer to your clients from across the world rather than compressing data. That is another reason why compression isn't the first thing to look at.
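To put rough numbers on the bandwidth-versus-latency point, here is a back-of-the-envelope model that assumes transfer time is about one round trip plus size divided by bandwidth. It deliberately ignores TCP slow start, TLS handshakes and server processing time, so treat the output as an order-of-magnitude guide only; the function names are made up for this sketch.

```python
def transfer_time(size_bytes, bandwidth_bps, rtt_s):
    """Very rough model: one round trip plus serialization time."""
    return rtt_s + size_bytes * 8 / bandwidth_bps


def saving_from_halving(size_bytes, bandwidth_bps, rtt_s):
    """Fraction of total request time saved if the payload is cut by 50%."""
    before = transfer_time(size_bytes, bandwidth_bps, rtt_s)
    after = transfer_time(size_bytes / 2, bandwidth_bps, rtt_s)
    return 1 - after / before


# 10 Mbit/s link with a 100 ms round trip:
small = saving_from_halving(10_000, 10_000_000, 0.1)    # 10 KB payload
large = saving_from_halving(100_000, 10_000_000, 0.1)   # 100 KB payload
```

Under these assumptions, halving a 10 KB payload saves under 4% of the request time, while halving a 100 KB payload saves about 22%. The smaller the payload relative to the round-trip time, the less compression buys.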
Baseline your performance and benchmark your experiments for a representative user base.
I think what you are weighing here is going to be the speed of your processor / cpu vs the speed of your network connection.
Network connection can be impacted by things like distance, signal strength, DNS provider, etc; whereas, your computer hardware is only limited by how much power you've put in it.
I'd wager that compressing your data before sending it would result in shorter response times, yes, but it's probably going to be by a very small amount. If you are sending JSON, the text usually isn't all that large to begin with, so you would probably only see a change in performance at the millisecond level.
If that's what you are looking for, I'd go ahead and implement it, set some timing before and after, and check your results.
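A quick way to get a first measurement, assuming plain gzip as a stand-in for whatever compression scheme you end up with (the question has a custom, more compact encoding in mind, which will behave differently). The record shape below is invented and only loosely JSON:API-like.

```python
import gzip
import json
import time


def compression_report(records):
    """Serialize records to JSON, gzip them, and report sizes and CPU time."""
    raw = json.dumps(records).encode("utf-8")
    start = time.perf_counter()
    packed = gzip.compress(raw)
    elapsed = time.perf_counter() - start
    return {
        "raw_bytes": len(raw),
        "gzip_bytes": len(packed),
        "ratio": len(packed) / len(raw),
        "compress_seconds": elapsed,
    }


# Hypothetical payload: 1000 very regular records.
records = [
    {"type": "articles", "id": str(i), "attributes": {"title": f"Title {i}"}}
    for i in range(1000)
]
report = compression_report(records)
```

Comparing `compress_seconds` against the transmission time you expect to save on a representative connection gives a first indication of whether the tradeoff is worth it.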

Gauging a web browser's bandwidth

Is it possible to gauge a web browser's upload and/or download speed by monitoring normal http requests? Ideally a web application would be able to tell the speed of a client without any modifications and without client-side scripting like JavaScript/Java/Flash. So even if a client was accessing the service with a library like Curl it would still work. If this is possible, how? If it's not possible, why? How accurate can this method be?
(If it helps assume PHP/Apache, but really this is a platform independent question. Also being able to gauge the upload speed is more important to me.)
Overview
You're asking for what is commonly called "passive" available bandwidth (ABW) measurement along a path (versus measuring a single link's ABW). There are a number of different techniques [1] that estimate bandwidth using passive observation, or low-bandwidth "Active" ABW probing techniques. However, the most common algorithms used in production services are active ABW techniques; they observe packet streams from two different end-points.
I'm most familiar with yaz, which sends packets from one side and measures variation in delay on the other side. The one-sided passive path ABW measurement techniques are considered more experimental; there aren't solid implementations of the algorithms AFAIK.
Discussion
The problem with the task you've asked for is that all non-intrusive [2] ABW measurement techniques rely on timing. Sadly, timing is a very tricky thing when working with http...
You have to deal with the reality of object caching (for instance, akamai) and http proxies (which terminate your TCP session prematurely and often spoof the web-server's IP address to the client).
You have to deal with web-hosts which may get intermittently slammed
Finally, active ABW techniques rely on a structured packet stream (wrt packet sizes and timing), unlike what you see in a standard http transfer.
Summary
In summary, unless you set up a dedicated client / server / protocol just for ABW measurement, I think you'll be rather frustrated with the results. You can keep your ABW socket connections on TCP/80, but the tools I have seen won't use http [3].
Editorial note: My original answer suggested that ABW with http was possible. On further reflection, I changed my mind.
END-NOTES:
---
[1] See Sally Floyd's archive of end-to-end TCP/IP bandwidth estimation tools
[2] The most common intrusive techniques (such as speedtest.net) use a flash or java applet in the browser to send & receive 3-5 parallel TCP streams to each endpoint for 20-30 seconds. Add the streams' average throughput (not including lost packets requiring retransmission) over time, and you get that path's tx and rx ABW. This is obviously pretty disruptive to VoIP calls, or any downloads in progress. Disruptive measurements are called bulk transfer capacity (BTC). See RFC 3148: A Framework for Defining Empirical Bulk Transfer Capacity Metrics. BTC measurements often use HTTP, but BTC doesn't seem to be what you're after.
[3] That is good, since it removes the risk of in-line caching by denying http caches an object to cache; although some tools (like yaz) are udp-only.
Due to the way TCP connections adapt to available bandwidth, no, this is not possible. Requests are small and typically fit within one or two packets. You need at least a dozen full-size packets to get even a coarse bandwidth estimate, since TCP first has to scale up to available bandwidth ("TCP slow start"), and you need to average out jitter effects. If you want any accuracy, you're probably talking hundreds of packets required. That's why upload rate measurement scripts typically transfer several megabytes of data.
OTOH, you might be able to estimate round-trip delay from the three-way handshake and the timing of acks. But download speed has at least as much impact as upload speed.
There's no support in javascript or any browser component to measure upload performance.
The only way I can think of is if you are uploading to a page/http handler, and the page is receiving the incoming bytes, it can measure how many bytes it is receiving per second. Then store that in some application wide dictionary with a session ID.
Then from the browser you can periodically poll the server to get the value in the dictionary using the session ID and show it to user. This way you can tell how's the upload speed.
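As a rough sketch of the server-side part, assuming the upload handler can record a timestamp and a byte count each time it reads a chunk of the request body. The function name and the sample format are made up for illustration and are not tied to any particular web framework.

```python
def upload_rate_bps(samples):
    """Estimate upload throughput in bits per second.

    samples: list of (timestamp_seconds, bytes_in_chunk) in arrival order.
    Returns None when there is not enough data to measure an interval.
    """
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    # The first chunk arrived before the first timestamp, so it falls
    # outside the measured interval and is left out of the total.
    total_bytes = sum(n for _, n in samples[1:])
    return total_bytes * 8 / elapsed


# Hypothetical application-wide dictionary keyed by session ID.
rates_by_session = {}
rates_by_session["session-123"] = upload_rate_bps(
    [(0.0, 1_000), (0.5, 50_000), (1.0, 50_000)]
)
```

The polling endpoint would then just look up `rates_by_session` for the caller's session ID. Note that short uploads will under-report the rate because of TCP slow start.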
You can use AJAXOMeter, a JavaScript library which measures your up- and download speed. You can see a live demo here.
That is not feasible in general as in-bound and out-bound bandwidth frequently is not symmetric. Different ISPs have significantly different ratios here that can vary on even time of the day basis.

Measuring link quality between two machines

Are there any standard methods (libraries) for measuring the quality of a link/connection between two computers?
These results would be used to improve routing logic. If the connection condition is unacceptable, stop the data transfer to that computer and initiate an alternative route for that transfer. It looks like Skype has some of this functionality.
I was thinking to establish several continuous testing streams that can show bandwidth problems, and some kind of ping-pong messaging logic to show latency values.
Link Reliability
I usually use a continuous traceroute (i.e. mtr) for isolating unreliable links; but for your purposes, you could start with average ping statistics as #recursive mentioned. Migrate to more complicated things (like a UDP/TCP echo protocol) if you find that ICMP is getting blocked too often by client firewalls in the path.
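If ICMP does turn out to be blocked, a UDP echo is simple enough to write yourself. The following is a minimal sketch with an invented wire format (a 4-byte sequence number) and arbitrary timeouts; both function names are made up for this example.

```python
import socket
import threading
import time


def run_echo_server(sock, stop):
    """Echo every datagram back to its sender until `stop` is set."""
    sock.settimeout(0.1)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def probe(server_addr, count=5, timeout=0.5):
    """Send `count` numbered datagrams; return (rtts_in_seconds, lost_count)."""
    rtts, lost = [], 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            payload = seq.to_bytes(4, "big")
            sent = time.perf_counter()
            s.sendto(payload, server_addr)
            try:
                while True:
                    data, _ = s.recvfrom(2048)
                    if data == payload:  # skip late replies to earlier probes
                        break
                rtts.append(time.perf_counter() - sent)
            except socket.timeout:
                lost += 1
    return rtts, lost
```

`run_echo_server` would run in a thread or process on the remote machine with a bound UDP socket, and `probe` on the machine making the routing decision. The spread of the returned RTTs gives a jitter estimate, and `lost` gives a crude loss rate.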
Bandwidth / Delay Estimation
For bandwidth and delay estimation, yaz provides a low-bandwidth algorithm to estimate throughput / delay along the path; it uses two different endpoints for measurement, so your client and servers will need to coordinate their usage.
Sally Floyd maintains a pretty good list of bandwidth estimation tools that you may want to check out if yaz isn't what you are looking for.
Ping is good for testing latency, but not bandwidth.

how TCP can be tuned for high-performance one-way transmission?

My (network) client sends 50 to 100 KB data packets every 200 ms to my server. There are up to 300 clients. The server sends nothing to the clients. Server (dedicated) and clients are on a LAN. How can I tune the TCP configuration for better performance? The server is on Windows Server 2003 or 2008, clients on Windows 2000 and up.
e.g. TCP window size. Does changing this parameter help? anything else? any special socket options?
[EDIT]: actually in different modes packets can be up to 5MB
I did a study on this a couple of years ago with 1700 data points. The conclusion was that the single best thing you can do is configure an enormous socket receive buffer (e.g. 512k) at the receiver. Do that to the listening socket, so it will be inherited by the accepted sockets, so it will already be set while they are handshaking. That in turn allows TCP window scaling to be negotiated during the handshake, which allows the client to know about the window size > 64k. The enormous window size basically lets the client transmit at the maximum possible rate, subject only to congestion avoidance rather than closed receive windows.
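In socket terms that looks roughly like the sketch below. Python is used here purely for brevity; the same `SO_RCVBUF` option exists in Winsock and BSD sockets. The helper name is made up, and the OS may clamp or round the requested size, so it is worth reading the value back.

```python
import socket


def make_listener(host, port, rcvbuf_bytes=512 * 1024):
    """Create a listening TCP socket with a large receive buffer."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the buffer on the listening socket, before listen(), so that it is
    # already in place when accepted connections perform their handshake.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    srv.bind((host, port))
    srv.listen()
    return srv


def effective_rcvbuf(sock):
    """Return the receive buffer size the OS actually applied."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```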
What OS?
IPv4 or v6?
Why so large of a dump ; why can't it be broken down?
Assuming a solid, stable, low bandwidth-delay product, you can adjust things like inflight sizing, initial window size, MTU (depending on the data, IP version, and mode [tcp/udp]).
You could also round robin or balance inputs, so you have less interrupt time from the nic .. binding is an option as well..
5MB /packet/? That's a pretty poor design .. I would think it'd lead to a lot of segment retrans's , and a LOT of kernel/stack mem being used in sequence reconstruction / retransmits (accept wait time, etc)..
(Is that even possible?)
Since all clients are in LAN, you might try enabling "jumbo frames" (need to run a netsh command for that, would need to google for the precise command, but there are plenty of how-tos).
On the application layer, you could use TransmitFile, which is the Windows sendfile equivalent and which works very well under Windows Server 2003 (it is artificially rate-limited under "non server", but that won't be a problem for you). Note that you can use a memory mapped file if you generate the data on the fly.
As for tuning parameters, increasing the send buffer will likely not give you any benefit, though increasing the receive buffer may help in some cases because it reduces the likelihood of packets being dropped if the receiving application does not handle the incoming data fast enough. A bigger TCP window size (registry setting) may help, as this allows the sender to send out more data before having to block until ACKs arrive.
Yanking up the program's working set quota may be worth a consideration, it costs you nothing and may be an advantage, since the kernel needs to lock pages when sending them. Being allowed to have more pages locked might make things faster (or might not, but it won't hurt either, the defaults are ridiculously low anyway).

Cannot achieve full speed on Symmetrical Internet Connection

We are using a business Ethernet connection (3Mbit upload, 3Mbit download) and trying to understand issues with our tested bandwidth speeds. When uploading a large file we sustain 340 KB/s; downloading we sustain 340 KB/s. However, when we run these transfers simultaneously the two transfer speeds rise and fall erratically, with an average speed for both of around 250 KB/s. We're using a Hatteras HN404 CPi and we've bypassed the router (plugged a machine directly into the Hatteras; set the NIC to full-duplex).
Is this expected? Should a max upload interfere with a max download on this type of Internet connection?
Are you sure the bottleneck is your connection?
Do you also see this behavior when the simultaneous upload and download are occurring on different systems, or only when one system is handling both the upload and download?
If the problem goes away when independent machines are doing the work, the bottleneck is likely closer to the hard drive.
This sounds expected from my experience with lower end lines. On a home line, I've found that traffic shaping and changing buffer sizes can be a huge help.
TCP/IP without any unusual traffic shaping will favor the most aggressive traffic at the expense of everything else. In your case, this means responses to the outgoing ACKs and such for the download will be delayed or maybe even dropped. See if your HN404 supports class based queuing or something similar and try it out.
Yes it is expected. This is symptomatic of any case in which you have a throttled or capped connection. If you saturate your uplink it will affect your downlink and vice versa.
This is because your connection's rate-limiting impacts the TCP acknowledgement packets (ACKs) and disrupts the normal "balance" of how these packets flow.
This is very thoroughly described on this page about Cable Modem Troubleshooting Tips, although it is not limited to cable modems:
If you saturate your cable modem's upload cap with an upload, the ACK packets of your download will have to queue up waiting for a gap between the congested upload data packets. So your ACKs will be delayed getting back to the remote download server, and it will therefore believe you are on a very slow link, and slow down the transmission of further data to you.
So how do you avoid this? The best way is to implement some sort of traffic-shaping or QoS (Quality of Service) on individual sessions to limit them to a maximum throughput based on a percentage of your total available bandwidth.
For example, on my home network I have it set so that no outbound connection can utilize more than 67% (2/3) of my 192Kbps uplink. That means any single outbound session can only utilize 128Kbps, which protects my downlink speed by preventing the uplink from becoming saturated.
In most cases you are able to perform this kind of traffic-shaping based on any available criteria such as source ip, destination ip, protocol, port, time of day, etc.
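Real traffic shaping is normally done on the router or in the OS, but the underlying idea is usually some form of token bucket. Here is a minimal sketch of one, using the 2/3-of-192Kbps figure from the example above; the class name and the 1500-byte burst size are assumptions for illustration.

```python
class TokenBucket:
    """Allow sends only while byte tokens are available; tokens refill at a
    fixed rate up to a maximum burst size."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes = rate_bps / 8   # refill rate in bytes per second
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = None                 # time of the previous call

    def allow(self, nbytes, now):
        """Return True if `nbytes` may be sent at time `now` (in seconds)."""
        if self.last is not None:
            refill = (now - self.last) * self.rate_bytes
            self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


uplink_bps = 192_000
session_cap_bps = uplink_bps * 2 // 3    # 128_000, i.e. 2/3 of the uplink
bucket = TokenBucket(session_cap_bps, burst_bytes=1500)
```

A sender would call `bucket.allow(len(packet), time.monotonic())` before each send and queue or delay the packet when it returns False.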
It appears that I was wrong about the simultaneous transfer speeds. The 250KB/s speeds up and down were miscalculated by the transfer program (seemed to have been showing a high average speed). Apparently the Business Ethernet (in this case it is an XO circuit provisioned by Speakeasy) only supports 3Mb total, not up AND down (for 6Mbit total). So if I am transferring up and down at the same time in theory I should only have 1.5Mbit up and down or 187.5KB/s at the maximum (if there was zero overhead).
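The unit conversion behind those figures, as a small check. It assumes decimal units (1 Mbit = 1,000,000 bits, 1 KB = 1000 bytes) and zero protocol overhead; the function name is made up.

```python
def mbit_to_kbyte_per_s(mbit_per_s):
    """Convert a line rate in Mbit/s to KB/s (decimal units, no overhead)."""
    return mbit_per_s * 1_000_000 / 8 / 1000


full_rate = mbit_to_kbyte_per_s(3)      # 375.0 KB/s for the whole 3 Mbit
half_rate = mbit_to_kbyte_per_s(1.5)    # 187.5 KB/s per direction when shared
```

The single-direction 340 KB/s figure from the question is therefore about 91% of the theoretical 375 KB/s, which is consistent with ordinary protocol overhead on a 3 Mbit line.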

Resources