Is there a theorem like CAP for web development? - performance

When you're building something in a web development scenario you're often thinking of costs/resources, and you're often juggling between three resources:
CPU (Processing in general)
Memory (Storage in general)
Network/Bandwidth (Or maybe even external/server resources)
The theorem here is simple, you can only choose two of these to be low.
If you want low CPU and Memory, you'll have to ask the server to do the work (High bandwidth usage)
If you want low Memory and Bandwidth, the CPU will have to do extra work to create and recreate things on the go.
If you want low CPU and Bandwidth, the memory will have to store more information and possibly duplicated data.
My question here, is there a name for this theorem? Or the managing of these 3 resources? I would like to know more about the theory behind choosing the best options in each scenario and researches related to this.
To be honest I don't know if this is the right community for this question, it is mostly a theoretical/academic question.

It doesn't work this way. For all 3 resources you can achieve low usage percentage by simply adding more resources in most cases. If you have high CPU usage then deploy your app to the server with two times more CPUs. So if you have money you can achieve low usage levels. In CAP it doesn't matter how much money you have.
Also from Wikipedia:
CAP is frequently misunderstood as if one has to choose to abandon one of the three guarantees at all times. In fact, the choice is really between consistency and availability only when a network partition or failure happens; at all other times, no trade-off has to be made

Related

What methodology would you use to measure the load capacity of a software server application?

I have a high-performance software server application that is expected to get increased traffic in the next few months.
I was wondering what approach or methodology is good to use in order to gauge if the server still has the capacity to handle this increased load?
I think you're looking for Stress Testing and the scenario would be something like:
Create a load test simulating current real application usage
Start with current number of users and gradually increase the load until
you reach the "increased traffic" amount
or errors start occurring
or you start observing performance degradation
whatever comes the first
Depending on the outcome you either can state that your server can handle the increased load without any issues or you will come up with the saturation point and the first bottleneck
You might also want to execute a Soak Test - leave the system under high prolonged load for several hours or days, this way you can detect memory leaks or other capacity problems.
More information: Why ‘Normal’ Load Testing Isn’t Enough
Test the product with one-tenth the data and traffic. Be sure the activity is 'realistic'.
Then consider what will happen as traffic grows -- with the RAM, disk, cpu, network, etc, grow linearly or not?
While you are doing that, look for "hot spots". Optimize them.
Will you be using web pages? Databases? Etc. Each of these things scales differently. (In other words, you have not provided enough details in your question.)
Most canned benchmarks focus on one small aspect of computing; applying the results to a specific application is iffy.
I would start by collecting base line data on critical resources - typically, CPU, memory usage, disk usage, network usage - and track them over time. If any of those resources show regular spikes where they remain at 100% capacity for more than a fraction of a second, under current usage, you have a bottleneck somewhere. In this case, you cannot accept additional load without likely outages.
Next, I'd start figuring out what the bottleneck resource for your application is - it varies between applications, but in most cases it's the bottleneck resource that stops you from scaling further. Your CPU might be almost idle, but you're thrashing the disk I/O, for instance. That's a tricky process - load and stress testing are the way to go.
If you can resolve the bottleneck by buying better hardware, do so - it's much cheaper than rewriting the software. If you can't buy better hardware, look at load balancing. If you can't load balance, you've got to look at application architecture and implementation and see if there are ways to move the bottleneck.
It's quite typical for the bottleneck to move from one resource to the next - you've got CPU to behave, but now when you increase traffic, you're spiking disk I/O; once you resolve that, you may get another CPU challenge.

How to avoid bottleneck performance?

A distributed system is described as scalable if it remains effective when there is a significant increase the number of resources and the number of users. However, these systems sometimes face performance bottlenecks. How can these be avoided?
The question is pretty broad, and depends entirely on what the system is doing.
Here are some things I've seen in systems to reduce bottlenecks.
Use caches, reducing network and disk bottlenecks. But remember that knowing when to evict from a cache eviction can a hard problem in some situations.
Use message queues to decouple components in the system. This way you can add more hardware to specific parts of the system that need it.
Delay computation when possible (often by using message queues). This takes the heat off the system during high-processing times.
Of course, design the system for parallel processing wherever possible. One host doing processing is NOT scalable. Note: most relational databases fall into the one-host bucket, this is why NoSQL has become suddenly popular; but not always appropriate (theoretically).
Use eventual consistency if possible. Strong consistency is much harder to scale.
Some are proponents for CQRS and DDD. Though I have never seen or designed a "CQRS system" nor a "DDD system," those have definitely affected the way I design systems.
There is a lot of overlap in the points above; some the techniques may use some of the others.
But, experience (your own and others) eventually teaches you about scalable systems. I keep up-to-date by reading about designs from google, amazon, twitter, facebook, and the like. Another good starting point is the high-scalability blog.
Just to build on a point discussed in the abover post, I would like to add that what you need for your distributed system is a distributed cache, so that when you intend on scaling your application the distributed cache acts like a "elastic" data-fabric meaning that you can increase storage capacity of the cache without compromising on performance and at the same time giving you a relaible platform that is accessible to multiple applications.
One such distributed caching solution is NCache. Do take a look!

LAMP stack performance under heavy traffic loads

I know the title of my question is rather vague, so I'll try to clarify as much as I can. Please feel free to moderate this question to make it more useful for the community.
Given a standard LAMP stack with more or less default settings (a bit of tuning is allowed, client-side and server-side caching turned on), running on modern hardware (16Gb RAM, 8-core CPU, unlimited disk space, etc), deploying a reasonably complicated CMS service (a Drupal or Wordpress project for arguments sake) - what amounts of traffic, SQL queries, user requests can I resonably expect to accommodate before I have to start thinking about performance?
NOTE: I know that specifics will greatly depend on the details of the project, i.e. optimizing MySQL queries, indexing stuff, minimizing filesystem hits - assuming web developers did a professional job - I'm really looking for a very rough figure in terms of visits per day, traffic during peak visiting times, how many records before (transactional) MySQL fumbles, so on.
I know the only way to really answer my question is to run load testing on a real project, and I'm concerned that my question may be treated as partly off-top.
I would like to get a set of figures from people with first-hand experience, e.g. "we ran such and such set-up and it handled at least this much load [problems started surfacing after such and such]". I'm also greatly interested in any condenced (I'm short on time atm) reading I can do to get a better understanding of the matter.
P.S. I'm meeting a client tomorrow to talk about his project, and I want to be prepared to reason about performance if his project turns out to be akin FourSquare.
Very tricky to answer without specifics as you have noted. If I was tasked with what you have to do, I would take each component in turn ( network interface, CPU/memory, physical IO load, SMP locking etc) and get the maximum capacity available, divide by rough estimate of use per request.
For example, network io. You might have 1x 1Gb card, which might achieve maybe 100Mbytes/sec. ( I tend to use 80% of theoretical max). How big will a typical 'hit' be? Perhaps 3kbytes average, for HTML, images etc. that means you can achieve 33k requests per second before you bottleneck at the physical level. These numbers are absolute maximums, depending on tools and skills you might not get anywhere near them, but nobody can exceed these maximums.
Repeat the above for every component, perhaps varying your numbers a little, and you will build a quick picture of what is likely to be a concern. Then, consider how you can quickly get more capacity in each component, can you just chuck $$ and gain more performance (eg use SSD drives instead of HD)? Or will you hit a limit that cannot be moved without rearchitecting? Also take into account what resources you have available, do you have lots of skilled programmer time, DBAs, or wads of cash? If you have lots of a resource, you can tend to reduce those constraints easier and quicker as you move along the experience curve.
Do not forget external components too, firewalls may have limits that are lower than expected for sustained traffic.
Sorry I cannot give you real numbers, our workloads are using custom servers, high memory caching and other tricks, and not using all the products you list. However, I would concentrate most on IO/SQL queries and possibly network IO, as these tend to be more hard limits, than CPU/memory, although I'm sure others will have a different opinion.
Obviously, the question is such that does not have a "proper" answer, but I'd like to close it and give some feedback. The client meeting has taken place, performance was indeed a biggie, their hosting platform turned out to be on the Amazon cloud :)
From research I've done independently:
Memcache is a must;
MySQL (or whatever persistent storage instance you're running) is usually the first to go. Solutions include running multiple virtual instances and replicate data between them, distributing the load;
http://highscalability.com/ is a good read :)

Is there a technique to predict performance impact of application

A customer is running a clustered web application server under considerable load. He wants to know if the upcoming application, which is not implemented yet, will still be manageable by his current setup.
Is there a established method to predict the performance impact of application in concept state, based on an existing requirement specification (or maybe a functional design specification).
First priority would be to predict the impact on CPU resource.
Is it possible to get fairly exact results at all?
I'd say the canonical answer is no. You always have to benchmark the actual application being deployed on its target architecture.
Why? Software and software development are not predictable. And systems are even more unpredictable.
Even if you know the requirements now and have done deep analysis what happens if:
The program has a performance bug (or two...) - which might even be a bug in a third-party library
New requirements are added or requirements change
The analysis and design don't spot all the hidden inter-relationships between components
There are non-linear effects of adding load and the new load might take the hardware over a critical threshold (a threshold that is not obvious now).
These concerns are not theoretical. If they were, SW development would be trivial and projects would always be delivered on time and to budget.
However there are some heuristics I personally used that you can apply. First you need a really good understanding of the current system:
Break the existing system's functions down into small, medium and large and benchmark those on your hardware
Perform a load test of these individual functions and capture thoughput in transactions/sec, CPU cost, network traffic and disk I/O figures for as many of these transactions as possible, making sure you have representation of small, medium and large. This load test should take the system up to the point where additional load will decrease transactions/sec
Get the figures for the max transactions/sec of the current system
Understand the rate of growth of this application and plan accordingly
Perform the analysis to get an 'average' small, medium and large 'cost' in terms of CPU, RAM, disk and network. This would be of the form:
Small transaction
CPU utilization: 10ms
RAM overhead 5MB (cache)
RAM working: 100kb (eg 10 concurrent threads = 1MB, 100 threads = 10MB)
Disk I/O: 5kb (database)
Network app<->DB: 10kb
Network app<->browser: 40kb
From this analysis you should understand how much headroom you have - CPU certainly, but check that there is sufficient RAM, network and disk capacity. Eg, the CPU required for small transactions is number of small transactions per second multiplied by the CPU cost of a small transaction. Add in the CPU cost of medium transactions and large ones, and you have your CPU budget.
Make sure the DBAs are involved. They need to do the same on the DB.
Now you need to analyse your upcoming application:
Assign each features into the same small, medium and large buckets, ensuring a like-for-like matching as far as possible
Ask deep, probing questions about how many transactions/sec each feature will experience at peak
Talk about the expected rate of growth of the application
Don't forget that the system may slow as the size of the database increases
On a personal note, you are being asked to predict the unpredictable - putting your name and reputation on the line. If you say it can fit, you are owning the risk for a large software development project. If you are being pressured to say yes, you need to ensure that there are many other people's names involved along with yours - and those names should all be visible on the go/no-go decision. Not only is this more likely to ensure that all factors are considered, and that the analysis is sound, but it will also ensure that the project has many involved individuals personally aligned to its success.

Optimal CPU utilization thresholds

I have built software that I deploy on Windows 2003 server. The software runs as a service continuously and it's the only application on the Windows box of importance to me. Part of the time, it's retrieving data from the Internet, and part of the time it's doing some computations on that data. It's multi-threaded -- I use thread pools of roughly 4-20 threads.
I won't bore you with all those details, but suffice it to say that as I enable more threads in the pool, more concurrent work occurs, and CPU use rises. (as does demand for other resources, like bandwidth, although that's of no concern to me -- I have plenty)
My question is this: should I simply try to max out the CPU to get the best bang for my buck? Intuitively, I don't think it makes sense to run at 100% CPU; even 95% CPU seems high, almost like I'm not giving the OS much space to do what it needs to do. I don't know the right way to identify best balance. I guessing I could measure and measure and probably find that the best throughput is achived at a CPU avg utilization of 90% or 91%, etc. but...
I'm just wondering if there's a good rule of thumb about this??? I don't want to assume that my testing will take into account all kinds of variations of workloads. I'd rather play it a bit safe, but not too safe (or else I'm underusing my hardware).
What do you recommend? What is a smart, performance minded rule of utilization for a multi-threaded, mixed load (some I/O, some CPU) application on Windows?
Yep, I'd suggest 100% is thrashing so wouldn't want to see processes running like that all the time. I've always aimed for 80% to get a balance between utilization and room for spikes / ad-hoc processes.
An approach i've used in the past is to crank up the pool size slowly and measure the impact (both on CPU and on other constraints such as IO), you never know, you might find that suddenly IO becomes the bottleneck.
CPU utilization shouldn't matter in this i/o intensive workload, you care about throughput, so try using a hill climbing approach and basically try programmatically injecting / removing worker threads and track completion progress...
If you add a thread and it helps, add another one. If you try a thread and it hurts remove it.
Eventually this will stabilize.
If this is a .NET based app, hill climbing was added to the .NET 4 threadpool.
UPDATE:
hill climbing is a control theory based approach to maximizing throughput, you can call it trial and error if you want, but it is a sound approach. In general, there isn't a good 'rule of thumb' to follow here because the overheads and latencies vary so much, it's not really possible to generalize. The focus should be on throughput & task / thread completion, not CPU utilization. For example, it's pretty easy to peg the cores pretty easily with coarse or fine-grained synchronization but not actually make a difference in throughput.
Also regarding .NET 4, if you can reframe your problem as a Parallel.For or Parallel.ForEach then the threadpool will adjust number of threads to maximize throughput so you don't have to worry about this.
-Rick
Assuming nothing else of importance but the OS runs on the machine:
And your load is constant, you should aim at 100% CPU utilization, everything else is a waste of CPU. Remember the OS handles the threads so it is indeed able to run, it's hard to starve the OS with a well behaved program.
But if your load is variable and you expect peaks you should take in consideration, I'd say 80% CPU is a good threshold to use, unless you know exactly how will that load vary and how much CPU it will demand, in which case you can aim for the exact number.
If you simply give your threads a low priority, the OS will do the rest, and take cycles as it needs to do work. Server 2003 (and most Server OSes) are very good at this, no need to try and manage it yourself.
I have also used 80% as a general rule-of-thumb for target CPU utilization. As some others have mentioned, this leaves some headroom for sporadic spikes in activity and will help avoid thrashing on the CPU.
Here is a little (older but still relevant) advice from the Weblogic crew on this issue: http://docs.oracle.com/cd/E13222_01/wls/docs92/perform/basics.html#wp1132942
If you feel your load is very even and predictable you could push that target a little higher, but unless your user base is exceptionally tolerant of periodic slow responses and your project budget is incredibly tight, I'd recommend adding more resources to your system (adding a CPU, using a CPU with more cores, etc.) over making a risky move to try to squeeze out another 10% CPU utilization out of your existing platform.

Resources