We're using flask-optimize to optimize responses of a Flask backend (serving an Angular app), the main objective being to reduce server reponse time and improve the app's performance.
Most responses data types are json, so we use the decorator #flask_optimize.optimize('json').
The observed results are: significant reduction of response size (expected, thanks to gzip compression), slight reduction of average response time, slight improvement of average Lighthouse performance score at page load.
These preliminary results are mostly positive: flask-optimize seems to be improving performance. But the tests made were limited and non-exhaustive, so these aren't conclusive results.
Is there actually any drawback of using flask-optimize?
For example, we wonder if the gzip compression, although reducing response size, could actually increase response time?
I tried to search around online but didn't find many resources about flask-optimize.
If there are no drawbacks, we would use it in all routes.
Related
My server used to handle 700+ user burst and now it is failing at around 200 users.
(Users are connecting to the server almost at the same time after clicking a push message)
I think the change is due to the change how the requests are made.
Back then, webserver collected all the information in a single response in an html.
Now, each section in a page is making a rest api request resulting in probably 10+ more requests.
I'm considering making an api endpoint to aggregate those requests for pages that users would open when they click on push notification.
Another solution I think of is caching those frequently used rest api responses.
Is it a good idea to combine api calls to reduce api calls ?
It is always a good idea to reduce API calls. The optimal solution is to get all the necessary data in one go without any unused information.
This results in less traffic, less requests (and loads) to the server, less RAM and CPU usage, as well as less concurrent DB operations.
Caching is also a great choice. You can consider both caching the entire request and separate parts of the response.
A combined API response means that there will be just one response, which will reduce the pre-execution time (where the app is loading everything), but will increase the processing time, because it's doing everything in one thread. This will result in less traffic, but a slightly slower response time.
From the user's perspective this would mean that if you combine everything, the page will load slower, but when it does it will load up entirely.
It's a matter of finding the balance.
And for the question if it's worth doing - it depends on your set-up. You should measure the start-up time of the application and the execution time and do the math.
Another thing you should consider is the amount of time this might require. There is also the solution of increasing the server power, like creating a clustered cache and using a load balancer to split the load. You should compare the needed time for both tasks and work from there.
I understand that the speed that a website loads is dependent on many things, however I'm interested to know how I can positively impact load speed by increasing the specifications on my dedicated server:
Does this allow my server to handle more requests?
Does this reduce roundtrips?
Does this decrease server response time?
Does this allow my server to generate pages on Wordpress faster?
yes-ish
no
yes-ish
yes-ish
Does this allow my server to handle more requests?
Requests come in and are essentially put into a queue until the system has enough time to handle it. By increasing system resources, such a queue might be faster processed, and such a queue might be configured to handle more requests simultaneously, so... yes-ish (note: this is very generalized)
Does this reduce roundtrips?
No, your application design is the only thing that effects this. If your application makes a request to the server, it makes a request (e.g., a "round trip"). If you increase your server resources, you do not in turn decrease the amount of requests your application makes.
Does this decrease server response time?
Yes, see first explanation. It can often decrease the response times for the same reasons given there. However, network latency and other factors outside the realm of the server can effect complete response processing times.
Does this allow my server to generate pages on Wordpress faster?
Again, see the first explanation. This can help your server generate pages faster by throwing more power at the processes that generate the pages. However, outside factors aside from the server still apply.
For performance, the two high target areas (assuming you don't have tons and tons of traffic, which most sites do not), are reducing database reads and caching. Caching covers various areas... data caching on the server, page output caching, browser caching for content, images, etc. If you're experiencing less than desirable performance, this is usually a good place to start.
What are some of the things that determine how many web requests a single machine can handle? In general what's an average number (requests per second) that most machines should be able to handle? For example, I see some answers that say 2k requests/s can easily be handled. How about 5k? 10k? etc.
I'm basically trying to do my best at estimating how many machines I'd need to scale to some high throughput, before I dive into load testing.
Yes, That is possible through performance modelling but the answers will have 5-10% error margin.
If you know exact size of web request then probably you can find your nw limit and thus this gives you maximum possible request acceptance limit. some exploratory test with sample test you can get the cpu time required by each request roughly(in terms of response time or thoughput). thus you can extrapolate the results for higher number of requests using many theorems examples, little's law. Using this theorem you can find maximum no of users (request here) can be supported on a give hw for a give acceptable response time.
but this all is done after tuning your application to expected level otherwise you will end up with lot of hw because of lack of tuning.
We are re-implementing(yes from scratch) a web application which is currently in production. We have decided to start doing some performance tests on the new app, to get some early information of the capabilities.
As the old application is currently in production and has a good performance we would like to extract some performance parameters, and then use this parameters as a reference or base goal of the performance of the new application.
Which do you think are the most relevant performance parameters we should be obtaining from the current production application?
Thanks!
Determine what pages are used the most.
Measure a latency histogram for the total time it takes to answer the request. Don't just measure the mean, measure a histogram.
From the histogram you can see how many % of requests have which latency in milliseconds. You can choose to key performance indicators by takes the values for 50% and 95%. This will tell you the average latency and the worst latency (for the worst 10% of requests).
Those two numbers alone will bring you great confidence regarding the experience your users will have.
Throughput does not matter for users, but for capacity planning.
I also recommend that you track the performance values over time and review them twice a year.
Just in case you need an HTTP client, there is weighttp, a multi-threaded client written by the guys from Lighttpd.
It has the same syntax used by ApacheBench, but weighttp lets you use several client worker threads (AB is single-threaded so it cannot saturate a modern SMP Web server).
The answer of "usr" is valid, but you can as well record the minnimum, average and maximum latencies (that's useful to see in which range they play). Here is a public-domain C program to automate all this on a given concurrency range.
Disclamer: I am involved in the development of this project.
I use Visual Studio Team System 2008 Team Suite for load testing of my Web-application (it uses ASP.MVC technology).
Load pattern:Constant (this means I have constant amount of virtual users all the time).
I specify coniguratiton of 1000 users to analyze perfomance of my Web-application in really stress conditions.I run the same load test multiple times while making some changes in my application.
But while analyzing load test results I come to a strange dependency: when average page response time becomes larger,the requests per second value increases too!And vice versa:when average page response time is less,requests per second value is less.This situation does not reproduce when the amount of users is small (5-50 users).
How can you explain such results?
Perhaps there is a misunderstanding on the term Requests/Sec here. Requests/Sec as per my understanding is just a representation of how any number of requests that the test is pushing into the application (not the number of requests completed per second).
If you look at it that way. This might make sense.
High Requests/Sec will cause higher Avg. Response Time (due to bottleneck somewhere, i.e. CPU bound, memory bound or IO bound).
So as your Requests/Sec goes up, and you have tons of object in memory, the memory is under pressure, thus triggering the Garbage Collection that will slow down your Response time.
Or as your Requests/Sec goes up, and your CPU got hammered, you might have to wait for CPU time, thus making your Response Time higher.
Or as your Request/Sec goes up, your SQL is not tuned properly, and blocking and deadlocking occurs, thus making your Response Time higher.
These are just examples of why you might see these correlation. You might have to track it down some more in term of CPU, Memory usage and IO (network, disk, SQL, etc.)
A few more details about the problem: we are load testing our rendering engine [NDjango][1] against the standard ASP.NET aspx. The web app we are using to load test is very basic - it consists of 2 static pages - no database, no heavy processing, just rendering. What we see is that in terms of avg response time aspx as expected is considerably faster, but to my surprise the number of requests per second as well as total number of requests for the duration of the test is much lower.
Leaving aside what we are testing against what, I agree with Jimmy, that higher request rate can clog the server in many ways. But it is my understanding that this would cause the response time to go up - right?
If the numbers we are getting really reflect what's happening on the server, I do not see how this rule can be broken. So for now the only explanation I have is that the numbers are skewed - something is wrong with the way we are configuring the tool.
[1]: http://www.ndjango.org NDjango
This is a normal result as the number of users increases you will load the server with higher numbers of requests per second. Any server will take longer to deal with more requests per second, meaning the average page response time increases.
Requests per second is a measure of the load being applied to the application and average page response time is a measure of the applications performance where high number=slow response.
You will be better off using a stepped number of users or a warmup period where the load is applied gradually to the server.
Also, with 1000 virtual users on a single test machine, the CPU of the test machine will be absolutely maxed out. That will most likely be the thing that is skewing the results of your testing. Playing with the number of virtual users you will find that there will be a point where the requests per second are maxed out. Adding or taking away virtual users will result in less requests per second from the app.