I'm currently working on a Spray-based web application backend. If you don't know what Spray is, never mind; just treat it as a backend HTTP request handling system. Unfortunately, there is no existing request-throttling support in Spray, so I'd like to write my own.
I don't want to use a token bucket or similar algorithm, because those require the server's capacity to be pre-configured, and you may give a very conservative estimate that falls far short of the server's real capacity.
So what I'd like to do is let the server actually learn its own capacity from request feedback: requests per second, responses per second, requests handled per second, and average response time. Mainly those four statistics, but not limited to them.
It's adaptive throttling, so the system is dynamically aware of its actual request handling capacity.
Can anyone suggest existing algorithms or related papers?
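One well-known family of feedback-driven algorithms is AIMD (additive increase, multiplicative decrease), the scheme behind TCP congestion control. Below is a minimal Python sketch of an AIMD-style concurrency limit driven by response-time feedback; all names and thresholds are illustrative, not Spray APIs:

```python
class AdaptiveLimiter:
    """AIMD-style adaptive concurrency limit (sketch).

    Grows the limit slowly while responses stay fast, and cuts it
    sharply when latency degrades or requests fail. The thresholds
    and constants here are illustrative placeholders.
    """

    def __init__(self, initial_limit=10, min_limit=1, max_limit=1000,
                 latency_threshold_ms=200.0):
        self.limit = initial_limit
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.latency_threshold_ms = latency_threshold_ms
        self.in_flight = 0

    def try_acquire(self):
        """Admit a request only if we are under the current limit."""
        if self.in_flight >= self.limit:
            return False  # reject / queue / shed this request
        self.in_flight += 1
        return True

    def on_complete(self, response_time_ms, failed=False):
        """Feed back the outcome of a finished request."""
        self.in_flight -= 1
        if failed or response_time_ms > self.latency_threshold_ms:
            # multiplicative decrease: back off quickly under stress
            self.limit = max(self.min_limit, int(self.limit * 0.8))
        else:
            # additive increase: probe for more capacity slowly
            self.limit = min(self.max_limit, self.limit + 1)
```

Healthy traffic slowly raises the limit and slow or failed responses cut it sharply, so the limit converges toward the server's actual capacity without any pre-configured estimate. Gradient-based variants that compare recent latency against a long-term baseline use the same acquire/feedback shape.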
I am wondering what the best strategy would be to send the same small (<100 B) payload to a large number of clients without breaking the bank on server resources.
I am trying to create an API that synchronizes multiple media players to one source, for the purpose of watch parties, through async data pushing over HTTP. I don't need to authenticate clients, and the data is not sensitive. The payload will be the same for everyone and very small, ~20-40 Unicode characters. I want the payload to be able to update every 2-3 seconds, but I predict a median update interval of 30-60 s. My limitation is that I want to be able to serve up to a million users at the same time and make it free to use.
I am not sure how to balance server cost against the performance needed for a possible burst of quick updates to clients. Are there any resources that would help me understand the cost/performance trade-off in my use case? What is the best way to approach this problem from a technical standpoint?
So websockets are out of the question, since streaming data and keeping up sessions is costly, right? Are AJAX pushes the most lightweight way to approach it? How does the fact that the payload is the same for everyone influence possible strategies for lightening the load? Would the lack of auth greatly influence the load? Is P2P out of the question?
The best approach would be to use websockets.
https://socket.io/ would be a very good starting point,
or https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API
Another approach is to use AJAX (as you mention), but every request would carry the full HTTP protocol overhead with it; with a payload this small, the headers alone can exceed the payload itself.
So binary data over a websocket seems the better solution.
My server used to handle bursts of 700+ users, and now it is failing at around 200 users.
(Users connect to the server almost at the same time after clicking a push message.)
I think the change is due to a change in how the requests are made.
Back then, the web server collected all the information into a single HTML response.
Now, each section of a page makes its own REST API request, resulting in probably 10+ requests per page.
I'm considering making an API endpoint that aggregates those requests for the pages users open when they click a push notification.
Another solution I'm thinking of is caching those frequently used REST API responses.
Is it a good idea to combine API calls in order to reduce the number of requests?
It is always a good idea to reduce API calls. The optimal solution is to get all the necessary data in one go, without any unused information.
This results in less traffic, fewer requests (and less load) on the server, lower RAM and CPU usage, as well as fewer concurrent DB operations.
Caching is also a great choice. You can consider both caching the entire response and caching separate parts of it.
A combined API response means that there will be just one response, which reduces the pre-execution time (where the app is loading everything) but increases the processing time, because everything is done in one thread. This results in less traffic but a slightly slower response time.
From the user's perspective, this means that if you combine everything, the page will load more slowly, but when it does, it will load up entirely.
It's a matter of finding the balance.
As for whether it's worth doing: it depends on your setup. You should measure the start-up time of the application and the execution time, and do the math.
Another thing to consider is the amount of time this might require. There is also the option of increasing server power, for example creating a clustered cache and using a load balancer to split the load. Compare the time needed for both tasks and work from there.
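For illustration, the combined endpoint plus a short-lived response cache described above can be sketched in framework-agnostic Python; the per-section loaders, names, and TTL are hypothetical:

```python
import time

# Hypothetical per-section loaders that each used to back
# a separate REST endpoint.
def load_header(user_id):   return {"title": "Push news"}
def load_articles(user_id): return [{"id": 1}, {"id": 2}]
def load_profile(user_id):  return {"user": user_id}

SECTIONS = {
    "header": load_header,
    "articles": load_articles,
    "profile": load_profile,
}

_cache = {}          # naive response cache: key -> (expires_at, value)
CACHE_TTL = 5.0      # seconds; illustrative

def aggregated_page(user_id):
    """One response carrying what used to be 10+ separate calls.

    A short TTL cache sits in front, since many users open the
    same page right after a push message.
    """
    key = ("page", user_id)
    now = time.monotonic()
    hit = _cache.get(key)
    if hit and hit[0] > now:
        return hit[1]        # cache hit: no loader runs at all
    page = {name: load(user_id) for name, load in SECTIONS.items()}
    _cache[key] = (now + CACHE_TTL, page)
    return page
```

During a push-driven burst, most requests would then hit the cache instead of fanning out to the loaders (and the database behind them).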
When performing AJAX requests, I have always tried to do as few as possible, since each request carries the overhead of opening an HTTP connection to send the data. Since a websocket connection is constantly open, is there any cost, beyond the obvious packet bandwidth, to sending a request?
For example: over the space of 1 minute, a client will send 100 kB of data to the server. Assuming the client does not need a response to any of these requests, is there any advantage to queuing packets and sending them in one big burst versus sending them as they are ready?
In other words, is there an overhead to stopping and starting data transfer on a connection that is constantly open?
I want to make a multiplayer browser game as real-time as possible, but I don't want to find that hundreds of tiny requests per minute, compared to a larger consolidated request, are causing the server additional stress. I understand that if the client needs a response, it will be slower, as there is a lot of waiting in the back and forth. I will consider this and only consolidate when it is appropriate. The more small requests per minute, the better the user experience, but I don't know what toll it will take on the server.
You are correct that a webSocket message will have lower overhead for a given message transmission than sending the same message via an Ajax call because the webSocket connection is already established and because a webSocket message has lower overhead than an HTTP request.
First off, there's always less overhead in sending one larger transmission vs. sending lots of smaller transmissions. That's just the nature of TCP. Every TCP packet gets separately processed and acknowledged so sending more of them costs a bit more overhead. Whether that difference is relevant or significant and worth writing extra code for or worth sacrificing some element of your user experience (because of the delay for batching) depends entirely upon the specifics of a given situation.
Since you've described a situation where your client gets the best experience if there is no delay and no batching of packets, then it seems that what you should do is not implement the batching and test out how your server handles the load with lots of smaller packets when it gets pretty busy. If that works just fine, then stay with the better user experience. If you have issues keeping up with the load, then seriously profile your server and find out where the main bottleneck to performance is (you will probably be surprised about where the bottleneck actually is as it is often not where you think it will be - that's why you have to profile and measure to know where to concentrate your energy for improving the scalability).
FYI, due to the implementation of Nagle's algorithm in most TCP stacks, the TCP stack itself does a small amount of batching for you if you are sending multiple requests closely spaced in time or sending over a slower link.
It's also possible to implement a dynamic system where as long as your server is able to keep up, you keep with the smaller and more responsive packets, but if your server starts to get busy, you start batching in order to reduce the number of separate transmissions.
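The dynamic system described above can be sketched as a sender that ships each message immediately while the server reports itself healthy, and falls back to size- or time-bounded batches once it is busy. A Python sketch, where the thresholds and the busy signal are illustrative:

```python
import time

class AdaptiveBatcher:
    """Send messages immediately when the server is healthy;
    batch them when it is busy to cut per-message overhead (sketch)."""

    def __init__(self, send, max_batch=20, max_delay=0.05):
        self.send = send            # callable taking a list of messages
        self.max_batch = max_batch
        self.max_delay = max_delay  # seconds to hold a batch at most
        self.server_busy = False    # toggled by server load feedback
        self.pending = []
        self.first_queued_at = None

    def submit(self, msg):
        if not self.server_busy:
            self.send([msg])        # responsive path: one frame per message
            return
        if not self.pending:
            self.first_queued_at = time.monotonic()
        self.pending.append(msg)
        if (len(self.pending) >= self.max_batch or
                time.monotonic() - self.first_queued_at >= self.max_delay):
            self.flush()

    def flush(self):
        if self.pending:
            self.send(self.pending)  # one frame carrying the whole batch
            self.pending = []
            self.first_queued_at = None
```

A real game loop would also flush on a timer so a lone queued message never waits longer than `max_delay`, and would flip `server_busy` based on whatever load metric the server reports back.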
I understand that the speed at which a website loads depends on many things; however, I'm interested to know how I can positively impact load speed by increasing the specifications of my dedicated server:
Does this allow my server to handle more requests?
Does this reduce roundtrips?
Does this decrease server response time?
Does this allow my server to generate pages on Wordpress faster?
yes-ish
no
yes-ish
yes-ish
Does this allow my server to handle more requests?
Requests come in and are essentially put into a queue until the system has enough time to handle them. By increasing system resources, such a queue might be processed faster, and it might be configured to handle more requests simultaneously, so... yes-ish (note: this is very generalized)
Does this reduce roundtrips?
No, your application design is the only thing that affects this. If your application makes a request to the server, it makes a request (i.e., a "round trip"). Increasing your server resources does not in turn decrease the number of requests your application makes.
Does this decrease server response time?
Yes, see the first explanation. It can often decrease response times for the same reasons given there. However, network latency and other factors outside the realm of the server can affect total response processing times.
Does this allow my server to generate pages on Wordpress faster?
Again, see the first explanation. This can help your server generate pages faster by throwing more power at the processes that generate them. However, factors outside the server still apply.
For performance, the two high-value target areas (assuming you don't have tons and tons of traffic, which most sites do not) are reducing database reads and caching. Caching covers various areas: data caching on the server, page-output caching, browser caching for content, images, etc. If you're experiencing less-than-desirable performance, this is usually a good place to start.
I'm developing a mobile app with 40 million users per day.
The app will show articles to the user that they can choose to read, no image, just plain text. The user can pull to refresh to get new articles.
I would like to add a Like button to each individual article (my own like button, not Facebook's). Assuming each user clicks Like 100 times per day, that is 40M × 100 = 4,000M data transfers per day.
I'm a newbie with no experience of big projects. What is the best approach for my project? I found that the Google Channel API costs 0.0001 dollars per channel created, which is 80M × 0.0001 = 8,000 USD per day (assuming 2 connections per person), which is quite expensive. Is there another way to do this, e.g. Ajax or a traditional POST? My app doesn't need real time. Which one consumes fewer resources? Can someone please guide me? I really need help.
I plan to use Google App Engine for this project.
A small difference in efficiency would multiply to a significant change in operational costs at those volumes. I would not blindly trust theoretical claims made by documentation. It would be sensible to build and test each alternative design and ensure it is compatible with the rest of your software. A few days of trials with several thousand simulated users will produce valuable results at a bearable cost.
Channels, Ajax and conventional web requests are all feasible at the conceptual level of your question. Add in some monitoring code and compare the results of load tests at various levels of scale. In addition to performance and cost, the instrumentation code should also monitor reliability.
I highly doubt your app will get 40 million users a day, and doubt even more that each of those will click Like 100 times a day.
But I don't understand why clicking Like would result in a lot of data transfer. It's a simple Ajax request, which wouldn't even need to return anything more than an empty response, with a 200 status code for success and a 400 for failure.
I would experiment with different options on a small scale first, to get some data from which you can extrapolate your costs. However, a simple Ajax request with a lightweight handler is likely to be cheaper than the Channel API.
If you are getting 40m daily users, reading at least 100 articles, then making 100 likes, I'm guessing you will have something exceeding 8bn requests a day. On that basis, your instance costs are likely to be significant before even considering a like button. At that volume of requests, how you handle the request on the server side will be important in managing your costs.
Using tools like AppStats, Chrome Dev Tools and Apache JMeter will help you get a better view of your response times, instance and bandwidth costs, and user experience before you scale up.
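On the server-side handling cost: a common pattern at this request volume is to accumulate likes in memory and flush per-article totals to the datastore in batches, so billions of clicks do not become billions of individual writes. A Python sketch, where the flush target, threshold, and names are hypothetical, and in-memory counts trade a little durability for cost:

```python
from collections import Counter

class LikeAggregator:
    """Count likes in memory and flush per-article totals in batches,
    turning many clicks into one datastore write per batch (sketch)."""

    def __init__(self, write_batch, flush_every=1000):
        self.write_batch = write_batch   # callable taking {article_id: count}
        self.flush_every = flush_every   # flush after this many clicks
        self.pending = Counter()
        self.clicks = 0

    def like(self, article_id):
        self.pending[article_id] += 1
        self.clicks += 1
        if self.clicks >= self.flush_every:
            self.flush()

    def flush(self):
        if self.pending:
            self.write_batch(dict(self.pending))  # one write per batch
            self.pending.clear()
            self.clicks = 0
```

A real deployment would also flush on a timer and accept that a crash loses at most one batch of counts, which is usually acceptable for like totals.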