When getting a list of items via a JSON call, is it better to make several small calls (fetching data as it's needed) or one large call with all the data?
For example, you have a json call to get a list of books matching a particular title keyword. There are 100 results. You're displaying the data in a paginated form - 10 results per 'page'. Is it more efficient to make one call and get all the results or to make a call for the next 10 on each page?
I would imagine it's partly determined by how many results there are. If it's some huge number the second option seems clear. But what is a good limit to the number you can get in one call - 100, 1000, 10,000 items?
Generally, each ajax call has an overhead, so lowering the number of separate calls improves performance, unless the data is large.
With paging, it is generally better not to fetch all the data up front, because users usually don't move through all the pages; you lower the load on the server by not transferring data nobody looks at. On the other hand, if the data is relatively small, or you believe the user will need to see all of it, fetch it all to save the overhead of multiple calls.
It depends.
Obviously, you want to keep the bandwidth usage to a minimum, but there is also an overhead to each individual call. You'll have to make some educated guesses, most importantly: how likely is it that you are going to need the data from pages 2 to 100?
If it is very likely (say, in 90% of the cases users are going to click through many pages of the same result set), then I'd download the whole result in one go, but otherwise, I'd load individual pages as you go.
Another thing to keep in mind is latency. Every ajax call has a certain latency, depending on the distance (in network topology, not necessarily geographical) between client and server. For the first load, the latency is inevitable, but after that, you need to ask yourself whether fast response is important. Under normal circumstances, it is expected and acceptable, but if your typical use case involves flipping back and forth between pages a lot, then it might become a nuisance, and you might consider buying snappiness at the cost of a longer initial loading time.
If you want to load multiple pages, but the result set is too large (say, thousands or millions of pages), you might think about more sophisticated schemes, e.g., download the requested page and the next 10, or download the requested page immediately and then prefetch the next 10 pages in the background.
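As a rough illustration of that last idea, here is a sketch of "load the requested page now, prefetch the next few in the background" in TypeScript. The /api/books endpoint, its query parameters, and the prefetch depth are assumptions made up for this example, not anything prescribed above.

```typescript
// Sketch: fetch the requested page immediately, then warm the next few pages.
// The endpoint and its parameters are invented for illustration.
interface Book {
  title: string;
  author: string;
}

const PAGE_SIZE = 10;
const PREFETCH_PAGES = 2; // how far ahead to warm the cache
const pageCache = new Map<number, Book[]>();

async function fetchPage(keyword: string, page: number): Promise<Book[]> {
  const cached = pageCache.get(page);
  if (cached) return cached;
  const res = await fetch(
    `/api/books?q=${encodeURIComponent(keyword)}&page=${page}&size=${PAGE_SIZE}`
  );
  const books: Book[] = await res.json();
  pageCache.set(page, books);
  return books;
}

// Show the requested page right away, then quietly prefetch the next pages
// so flipping forward feels instant.
async function showPage(keyword: string, page: number): Promise<Book[]> {
  const books = await fetchPage(keyword, page);
  for (let p = page + 1; p <= page + PREFETCH_PAGES; p++) {
    void fetchPage(keyword, p).catch(() => {}); // best-effort prefetch
  }
  return books;
}
```

Whether two pages of prefetch is the right depth depends on how users actually page through results; the point is only that the cache lookup makes repeat visits free.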
My server used to handle bursts of 700+ users, and now it is failing at around 200 users.
(Users are connecting to the server almost at the same time after clicking a push message.)
I think this is due to a change in how the requests are made.
Back then, the web server collected all the information into a single HTML response.
Now, each section of a page makes its own REST API request, resulting in probably 10+ requests per page.
I'm considering making an API endpoint that aggregates those requests for the pages users open when they click on a push notification.
Another solution I'm thinking of is caching those frequently used REST API responses.
Is it a good idea to combine API calls to reduce the number of requests?
It is always a good idea to reduce API calls. The optimal solution is to get all the necessary data in one go without any unused information.
This results in less traffic, fewer requests (and less load) on the server, lower RAM and CPU usage, and fewer concurrent DB operations.
Caching is also a great choice. You can consider both caching the entire request and separate parts of the response.
A combined API response means there will be just one response, which reduces the pre-execution time (where the app is loading everything) but increases the processing time, because it's doing everything in one thread. This results in less traffic but a slightly slower response.
From the user's perspective, this means that if you combine everything, the page takes longer before it appears, but when it does, it loads in its entirety.
It's a matter of finding the balance.
As for whether it's worth doing - it depends on your set-up. You should measure the start-up time of the application and the execution time, and do the math.
Another thing you should consider is the amount of time this might require. There is also the solution of increasing the server power, like creating a clustered cache and using a load balancer to split the load. You should compare the needed time for both tasks and work from there.
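To make the aggregation idea concrete, here is a sketch of a combined endpoint that fans out to the existing internal APIs server-side and returns one payload. The route name, the internal URLs, and the use of Express are all assumptions for illustration, not your actual setup.

```typescript
// Sketch of an aggregating endpoint: the page makes one request, and the
// server calls the existing internal APIs in parallel. All URLs are invented.
import express from "express";

const app = express();

async function getJson(url: string): Promise<unknown> {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`${url} failed with ${res.status}`);
  return res.json();
}

app.get("/api/push-landing", async (_req, res) => {
  try {
    // Fetch the pieces the landing page needs concurrently, instead of
    // letting the client issue 10+ separate requests.
    const [header, feed, notifications] = await Promise.all([
      getJson("http://internal/api/header"),
      getJson("http://internal/api/feed"),
      getJson("http://internal/api/notifications"),
    ]);
    res.json({ header, feed, notifications });
  } catch (err) {
    res.status(502).json({ error: String(err) });
  }
});

app.listen(3000);
```

A response like this is also an easy thing to cache as a whole, which combines well with the caching suggestion above.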
As I understand it, the benefit of using memcached is to shorten access to information stored in the database by caching it in memory. But isn't the overhead of the client-server model based on a network protocol (e.g. TCP) also considerable? My guess is that it might actually be worse, as network access is generally slower than local hardware access. What am I getting wrong?
Thank you!
It's true that caching won't address network transport time. However, what matters to the user is the overall time from request to delivery. If this total time is perceptible, then your site does not seem responsive. Appropriate use of caching can improve responsiveness, even if your overall transport time is out of your control.
Also, caching can be used to reduce overall server load, which will essentially buy you more cycles. Consider the case of a query whose response is the same for all users - for example, imagine that you display some information about site activity or status every time a page is loaded, and this information does not depend on the identity of the user loading the page. Let's imagine also that this information does not change very rapidly. In this case, you might decide to recalculate the information every minute, or every five minutes, or every N page loads, or something of that nature, and always serve the cached version. In this case, you're getting two benefits. First, you've cut out a lot of repeated computation of values that you've decided don't really need to be recalculated, which takes some load off your servers. Second, you've ensured that users are always getting served from the cache rather than from computation, which might speed things up for them if the computation is expensive.
Both of those could - in the right circumstances - lead to improved performance from the user's perspective. But of course, as with any optimization, you need benchmarks, and you need to benchmark against real data rather than your perceptions of what ought to be correct.
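As a tiny sketch of the "recalculate every N minutes and serve the cached version" idea above: the computeSiteStatus function and the five-minute TTL below are placeholders, not anything from your setup.

```typescript
// Minimal sketch: serve an expensive, user-independent value from a
// time-based in-process cache. Names and the TTL are made up.
const TTL_MS = 5 * 60 * 1000;

let cachedStatus: { value: unknown; expires: number } | null = null;

async function computeSiteStatus(): Promise<unknown> {
  // ...the expensive query or aggregation you only want to run occasionally...
  return { activeUsers: 0 };
}

async function getSiteStatus(): Promise<unknown> {
  const now = Date.now();
  if (cachedStatus && cachedStatus.expires > now) {
    return cachedStatus.value; // every user between recomputations gets this
  }
  const value = await computeSiteStatus();
  cachedStatus = { value, expires: now + TTL_MS };
  return value;
}
```

The same shape works with memcached instead of a local variable; the point is that the expensive computation runs once per interval rather than once per request.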
I am working on a webshop-type application. One feature I often see on other websites is a breakdown of filtering options, each followed by a total of how many results that filtering option will yield. You often see this on computer sites (e.g. Newegg) or used car sites. Example:
CPU:
* AMD (315)
* Intel (455)
Video card:
* ATI (378)
* Nvidia (402)
How can I efficiently calculate these totals? The website I am working on will have many different products (10,000+) with many different options. To make matters worse, the products are constantly changing.
Trying to precalculate all the different filtering combination totals seems unfeasible. If I have 5 different filters with 4 options each, the number of option possibilities would be 20 * 16 * 12 * 8 * 4 = 122880. It would take a long time to calculate that.
Another option would be to query on demand and cache the results (e.g. in Redis). But how could I manage the cache efficiently if products keep being added and removed? The caches would often be stale. I'm afraid I'd have to micro-manage cache invalidation somehow, leading to a very complex and brittle implementation. The alternative would be to invalidate broad sections of the cache, but immediately after invalidating, my database would be swamped by hundreds of queries from active users who need these totals recalculated.
Is there a nice and elegant way to handle this?
I see no problem with showing live data for your case. Not to discourage you in any way, but 10K products is not a lot, performance wise. Several millions, on the other hand, is.
Did you actually try to implement it this way and find that it performs slowly, or are you just worried about its theoretical performance? I suggest you do some stress-testing on your system AS IS and see if it's worth improving. Still, here are some ideas to make it faster:
Do not populate all counts at once; fetch them only when a specific category is expanded/clicked. That way you always end up with a single SELECT cat_name, COUNT(*) GROUP BY cat_name query, which should not take much time. A single, relatively light query like this per user click sounds reasonable to me (see the sketch after this list).
Let the database engine manage caching for you. If you perform similar queries often, your database engine should automatically optimize the underlying storage (i.e. move that whole table to memory or similar). You just need to make sure the instance has enough memory.
Upgrade server hardware, if needed. If the amount of data increases, you may not have enough memory to store everything. Don't panic yet, you can still put an SSD in, or install a 12 core Xeon processor into the server, depending on where the bottleneck is.
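A rough sketch of the on-demand count query from the first idea, assuming a Postgres database accessed through node-postgres; the table and column names (products, cpu_brand, gpu_brand) are invented and would need to match your schema.

```typescript
// Sketch: fetch counts for one filter group only when the user expands it,
// applying whatever filters are already selected. Assumes node-postgres (pg).
import { Pool } from "pg";

const pool = new Pool(); // connection settings come from PG* environment variables

async function countsForGroup(
  groupColumn: "cpu_brand" | "gpu_brand", // column names come from a fixed whitelist,
  selected: { column: string; value: string }[] // never directly from user input
): Promise<Map<string, number>> {
  const where = selected
    .map((f, i) => `${f.column} = $${i + 1}`)
    .join(" AND ");
  const sql = `
    SELECT ${groupColumn} AS option, COUNT(*) AS total
    FROM products
    ${where ? `WHERE ${where}` : ""}
    GROUP BY ${groupColumn}`;
  const res = await pool.query(sql, selected.map((f) => f.value));
  return new Map(
    res.rows.map((r): [string, number] => [r.option, Number(r.total)])
  );
}

// Example: the user expands the "Video card" group while "AMD" CPUs are selected.
// countsForGroup("gpu_brand", [{ column: "cpu_brand", value: "AMD" }]).then(console.log);
```

With 10K rows and an index on the filtered columns, a query like this is cheap enough to run live, which is the point of the answer above.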
What about thinking about it the other way around and keeping the numbers in the database? You could probably use triggers to automatically increment/decrement the counters when a product is added to or removed from a given category (if not, it can still be handled explicitly by the dialog that lets the store manager add/remove products on sale).
This seems like a good solution since a) I suppose the names of the categories are stored in the DB already, so asking for the numbers incurs very little overhead, and b) even though the products are constantly changing, they most likely change with much lower frequency than the frequency of requests (that still holds even if the users themselves can add/remove products). And finally c) there is no complicated caching scheme; the counters are managed in a single place, by a single part of the code. It should be easy to keep it error-free.
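If the trigger route is not available, the explicit variant could look roughly like this. The sketch assumes node-postgres and invented table names (products, category_counts); the key point is that the counter update happens in the same transaction as the insert.

```typescript
// Sketch of the explicit variant: the code that adds a product also updates a
// category_counts table inside the same transaction. Table names are made up.
import { Pool } from "pg";

const pool = new Pool();

async function addProduct(name: string, categoryId: number): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    await client.query(
      "INSERT INTO products (name, category_id) VALUES ($1, $2)",
      [name, categoryId]
    );
    // Keep the per-category counter in sync in the same transaction,
    // so the displayed totals can be read with a trivial SELECT.
    await client.query(
      "UPDATE category_counts SET total = total + 1 WHERE category_id = $1",
      [categoryId]
    );
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}
```

Removing a product is the mirror image with total - 1; a database trigger would do the same thing without touching application code.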
I am trying to spread out data that is received in bursts. This means I have data that is received by some other application in large bursts. For each data entry I need to make some additional requests to a server, on which I should limit the traffic. Hence I try to spread out the requests in the time I have until the next data burst arrives.
Currently I am using a token bucket to spread out the data. However, because the data I receive is already badly shaped, I am still either filling up the queue of pending requests or getting spikes whenever a burst comes in. So this algorithm does not seem to do the kind of shaping I need.
What other algorithms are there available to limit the requests? I know I have times of high load and times of low load, so both should be handled well by the application.
I am not sure if I was really able to explain the problem I am currently having. If you need any clarifications, just let me know.
EDIT:
I'll try to clarify the problem some more and explain why a simple rate limiter does not work.
The problem lies in the bursty nature of the traffic and the fact that bursts have different sizes at different times. What is mostly constant is the delay between bursts. Thus we get a bunch of data records to process and need to spread them out as evenly as possible before the next bunch comes in. However, we are not 100% sure when the next bunch will come in, just approximately, so simply dividing the time by the number of records does not work as it should.
Plain rate limiting does not work, because it does not spread the data sufficiently. If we are close to saturating the rate, everything is fine and we spread out evenly (although this should not happen too frequently). If we are below the threshold, the spreading gets much worse.
I'll make an example to make this problem more clear:
Let's say we limit our traffic to 10 requests per second and new data comes in about every 10 seconds.
If we get 100 records at the beginning of a time frame, we will query 10 records each second and have a perfectly even spread. However, if we get only 15 records, we'll have one second where we query 10 records, one second where we query 5 records, and 8 seconds where we query 0 records, so the level of traffic is very unequal over time. Instead, it would be better to query just 1.5 records each second. Setting that rate would also cause problems, though, since new data might arrive earlier, so we do not have the full 10 seconds and 1.5 queries per second would not be enough. If we use a token bucket, the problem actually gets even worse, because token buckets let bursts through at the beginning of the time frame.
However, this example oversimplifies, because in reality we cannot tell the exact number of pending requests at any given moment, just an upper limit. So we would have to throttle each time based on that number.
This sounds like a problem within the domain of control theory. Specifically, I'm thinking a PID controller might work.
A first crack at the problem might be dividing the number of records by the estimated time until next batch. This would be like a P controller - proportional only. But then you run the risk of overestimating the time, and building up some unsent records. So try adding in an I term - integral - to account for built up error.
I'm not sure you even need a derivative term, if the variation in batch size is random. So try using a PI loop - you might build up some backlog between bursts, but it will be handled by the I term.
If it's unacceptable to have a backlog, then the solution might be more complicated...
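A minimal sketch of that PI idea, with invented names and tuning constants; the gains kp and ki would need tuning against your actual burst pattern, and the estimate of time until the next burst is assumed to come from elsewhere.

```typescript
// Sketch of a PI-controlled sender: the send rate is recomputed from the
// current backlog and the estimated time until the next burst (the P term),
// plus an integral term so a persistent backlog gradually pushes the rate up.
class PiPacer {
  private integral = 0;

  constructor(
    private kp: number, // proportional gain
    private ki: number, // integral gain
    private maxRate: number // hard cap: requests per second we never exceed
  ) {}

  // backlog: records still waiting to be sent
  // secondsUntilNextBurst: rough estimate of the remaining time window
  rate(backlog: number, secondsUntilNextBurst: number): number {
    const target = backlog / Math.max(secondsUntilNextBurst, 1); // P term
    this.integral += backlog; // accumulates while a backlog persists (I term)
    const r = this.kp * target + this.ki * this.integral;
    return Math.min(Math.max(r, 0), this.maxRate);
  }

  // Call when the backlog has fully drained, so old error does not keep
  // inflating the rate.
  reset(): void {
    this.integral = 0;
  }
}

// Usage sketch (queue, sendUpTo and estimateSecondsUntilNextBurst are assumed):
// const pacer = new PiPacer(1.0, 0.01, 10);
// setInterval(() => {
//   const r = pacer.rate(queue.length, estimateSecondsUntilNextBurst());
//   sendUpTo(Math.floor(r), queue);
// }, 1000);
```

With kp near 1 and ki small, this behaves like the plain "records divided by time" pacer most of the time, and only ramps up when a backlog actually lingers.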
If there are no other constraints, what you should do is figure out the maximum rate at which you are comfortable sending the additional requests, and limit your processing speed accordingly. Then monitor what happens. If that gets through all of your requests quickly, there is no harm. If its sustained level of processing is not fast enough, then you need more capacity.
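For what that looks like in practice, a bare-bones sketch of a queue drained at a fixed rate; the rate, the queue contents, and the lookup URL are all placeholders.

```typescript
// Sketch: a queue drained at a fixed number of requests per second,
// regardless of how bursty the arrivals are. Everything here is illustrative.
const MAX_REQUESTS_PER_SECOND = 10; // the rate you have decided is acceptable

const queue: string[] = []; // incoming records land here in bursts

function enqueue(record: string): void {
  queue.push(record);
}

// Drain at the fixed rate and log the backlog, so you can monitor whether the
// configured rate keeps up with the sustained load.
setInterval(() => {
  const batch = queue.splice(0, MAX_REQUESTS_PER_SECOND);
  for (const record of batch) {
    // the follow-up request for this record (placeholder URL)
    void fetch(`https://example.com/lookup?id=${encodeURIComponent(record)}`);
  }
  console.log(`sent ${batch.length}, backlog ${queue.length}`);
}, 1000);
```

If the logged backlog keeps growing over many burst cycles, the chosen rate is too low and you need more capacity, which is exactly the monitoring signal described above.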
I recently completed development of a mid-trafficked(?) website (peak 60k hits/hour); however, the site only needs to be updated once a minute - and achieving the required performance can be summed up by a single word: "caching".
For a site like SO where the data feeding the site changes all the time, I would imagine a different approach is required.
Page cache times presumably need to be short or non-existent, and updates need to be propagated across all the webservers very rapidly to keep all users up to date.
My guess is that you'd need a distributed cache to control the serving of data and pages that is updated on the order of a few seconds, with perhaps a distributed cache above the database to mediate writes?
Can those more experienced than I outline some of the key architectural/design principles they employ to ensure that highly interactive websites like SO are performant?
The vast majority of sites have many more reads than writes. It's not uncommon to have thousands or even millions of reads to every write.
Therefore, any scaling solution depends on separating the scaling of the reads from the scaling of the writes. Typically, scaling reads is really cheap and easy, while scaling writes is complicated and costly.
The most straightforward way to scale reads is to cache entire pages at a time and expire them after a certain number of seconds. If you look at the popular website Slashdot, you can see that this is the way they scale their site. Unfortunately, this caching strategy can result in counter-intuitive behaviour for the end user.
I'm assuming from your question that you don't want this primitive sort of caching. Like you mention, you'll need to update the cache in place.
This is not as scary as it sounds. The key thing to realise is that, from the server's point of view, Stack Overflow does not update all the time. It updates fairly rarely - maybe once or twice per second. To a computer, a second is nearly an eternity.
Moreover, updates tend to occur to items in the cache that do not depend on each other. Consider Stack Overflow as an example. I imagine that each question page is cached separately. Most questions probably get an update per minute on average for the first fifteen minutes, and then perhaps once an hour after that.
Thus, in most applications you barely need to scale your writes. They're so few and far between that you can have one server doing the writes; updating the cache in place is actually a perfectly viable solution. Unless you have extremely high traffic, you're going to get very few concurrent updates to the same cached item.
So how do you set this up? My preferred solution is to cache each page individually to disk and then have many web-heads delivering these static pages from some mutually accessible space.
When a write needs to be done, it is done from exactly one server, and it updates that particular cached HTML page. Each server owns its own subset of the cache, so there isn't a single point of failure. The update process is carefully crafted so that a transaction ensures no two requests write to the file at exactly the same time.
I've found this design has met all the scaling requirements we have so far required. But it will depend on the nature of the site and the nature of the load as to whether this is the right thing to do for your project.
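As a rough illustration of that write path, here is a sketch of one writer re-rendering a cached HTML file atomically on disk. It assumes Node.js; the cache directory, file naming, and renderQuestionPage are invented names, and the temp-file-plus-rename trick stands in for whatever "transaction" mechanism the answer alludes to.

```typescript
// Sketch: update a cached page in place. Writes go to a temp file first and
// are then renamed, so readers never see a half-written page.
import { writeFile, rename } from "node:fs/promises";
import { join } from "node:path";

const CACHE_DIR = "/var/cache/pages"; // shared space the web heads serve from

async function renderQuestionPage(questionId: number): Promise<string> {
  // ...query the database and produce the full HTML for this question...
  return `<html><body>question ${questionId}</body></html>`;
}

// Called from the single server responsible for this item whenever the
// underlying data changes; web heads keep serving the old file until the
// rename lands.
async function updateCachedPage(questionId: number): Promise<void> {
  const html = await renderQuestionPage(questionId);
  const finalPath = join(CACHE_DIR, `question-${questionId}.html`);
  const tempPath = `${finalPath}.tmp`;
  await writeFile(tempPath, html, "utf8");
  await rename(tempPath, finalPath); // atomic on the same filesystem
}
```

The web heads then only ever serve static files from CACHE_DIR, which is what makes the read side trivially cheap to scale.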
You might be interested in this article, which describes how Wikimedia's servers are structured. Very enlightening!
The article links to this pdf - be sure not to miss it.