Now the machines we are forced to use are 2GB RAM, Intel Core 2 Duo E6850 @ 3GHz CPU...
The policy within the company is that everyone has the same computer no matter what and that they are on a 3 year refresh cycle... Meaning I will have this machine for the next 2 years... :S
We have been complaining like crazy but they said they want proof that upgrading the machines will provide exactly X time saving before doing anything... And with that they are only semi considering giving us more RAM...
Even when you put forward that developer resources are much more expensive than hardware, they firstly say go away, then after a while they say prove it. As far as the people who can replace the machines are concerned, wages come from a different bucket of money than the machines, so they don't care (paying wages doesn't come out of their pockets)...
So how can I prove that $X benefit will be gained by spending $Y on new hardware...
The stack I'm working with is as follows: VS 2008, SQL 2005/2008. As duties dictate, we are SQL admins as well as Web/Winform/WebService developers. So it's very typical to have 2 VS sessions and at least one SQL session open at the same time.
Cheers
Anthony
Actually, the main cost for your boss is not the lost productivity. It is that his developers don't enjoy their working conditions. This leads to:
loss of motivation and productivity
more stress causing illness
external opportunities causing developers to go away
That sounds like a decent machine for your stack. Have you proven to yourself that you're going to get better performance, using real-world tests?
Check with your IT people to see if you can get the disks benchmarked, and max out the memory. Management should be more willing to take these incremental steps first.
The machine looks fine apart from the RAM.
If you want to prove this sort of thing time all the things you wait for (typically load times and compile times), add it all up and work how much it costs you to sit around. From that make some sort of guess how much time you'll save (it'll have to be a guess unless you can compare like with like, which is difficult if they won't upgrade your systems). You'll probably find that they'll make the money back on the RAM at least in next to no time - and that's before you even begin to factor in the loss of productivity from people's minds wandering whilst they wait for stuff to happen.
Unfortunately if they're skeptical then it's unlikely you can prove it to them in a quantitative way alone. Even if you came up with numbers, they'll probably question the methodology. I suggest you see if they're willing to watch a 10 minute demo (maybe call it a presentation), and show them the experience of switching between VS instances (while explaining why you'd need to switch and how often), show them the build process (again explaining why you'd need to create a build and how often), etc.
Ask them if you're allowed to bring your own hardware. If you're really convinced it would make you more productive, upgrade it yourself and when you start producing more ask for a raise or to be reimbursed.
Short of that though..
I have to ask: what else are you running? I'm not really that familiar with that stack, but it really shouldn't be that taxing. Are they forcing you to run some kind of system-slowing monitoring or antivirus app?
You'd probably have better luck convincing them to let you change that than getting them to roll out new updates.
If you really must convince them, your best bet is to benchmark your machine as accurately as you can and price out exactly what you need upgraded. It's a lot easier to get them to agree to an exact (and low) dollar amount than some open-ended upgrade.
Even discussing this with them for more than five minutes will cost more than just calling your local PC dealer and buying the RAM out of your own pocket. Ask your project lead whether they can put it on the tab of the project as another "development tool". If (s)he can't, don't bother and cough up the
When they come complaining, then put the time of the meetings for this on their budget (since they come crying). See how long they can take this.
When we had the same issue, my boss bought better gfx cards for the whole team out of his own pockets and went to the PC guys to get each of us a second monitor. A few days later, he went again to get each of us 2GB more RAM, too.
The main cost from slow developer machines comes from the slow builds and the 'context switching', ie the time that it takes you to switch between the tasks required of you:
Firing up the second instance of VS and waiting for it to load and build
Checking out or updating a source tree
Starting up another instance of VS or checking out a clean source tree to 'have a quick look at' some bug that's been assigned
Multiple build/debug cycles to fix difficult bugs
The mental overhead in switching between different tasks, which shouldn't be underestimated
I made a case a while ago for new hardware after doing a breakdown of the amount of time that was wasted waiting for the machine to catch up. In a typical day we might need to do 2 or 3 full builds at half an hour each. The link time was around 3 minutes, and in a build/debug cycle you might do that 40 times a day. So that's 3.5 hours a day waiting for the machine. The bulk of that is in small 2 or 3 minute pockets which isn't long enough for you to context switch and do something else. It's long enough to check your mail, check stackoverflow, blow your nose and that's about it. So there's nothing else productive you can do with that time.
If you can show that a new machine will build the full project in 15 minutes and link in 1 minute then that's theoretically given you an extra 2 hours of productivity a day (or more realistically, the potential for more build cycles).
So I would get some objective timings that show how long it takes for different parts of your work cycle, then try to do comparative timings on machines with 4GB of RAM, a second drive (eg something fast like a WD Raptor), an SSD, whatever, to come up with some hard figures to support your case.
EDIT: I forgot to mention: present this as your current hardware making you lose productivity, and put a cost on the amount of time lost by multiplying it by a typical developer hourly rate. On this basis I was able to show that a new PC would pay for itself in about a month.
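A rough sketch of that breakdown in Python, using the build and link figures quoted above (3 full builds and 40 links a day). The machine price and hourly rate are assumptions picked purely for illustration, so substitute your own.

```python
def daily_wait_hours(full_builds, full_build_min, links, link_min):
    """Hours per day spent waiting on full builds and links."""
    return (full_builds * full_build_min + links * link_min) / 60.0


def payback_days(machine_cost, hours_saved_per_day, hourly_rate):
    """Working days until the time saved is worth the machine cost."""
    return machine_cost / (hours_saved_per_day * hourly_rate)


# Figures from the answer: 3 full builds at 30 min, 40 links at 3 min.
current = daily_wait_hours(3, 30, 40, 3)
# Hoped-for timings on a new machine: 15 min builds, 1 min links.
upgraded = daily_wait_hours(3, 15, 40, 1)
saved = current - upgraded

# Assumed, not from the answer: a $1,500 machine and a $50/hour developer.
days = payback_days(1500, saved, 50)

print(f"waiting now: {current:.2f} h/day, after upgrade: {upgraded:.2f} h/day")
print(f"saved: {saved:.2f} h/day, payback after about {days:.1f} working days")
```

The payback figure is only as good as the "after upgrade" timings, so those are the numbers worth measuring on a borrowed or test machine.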
Take a task you do regularly that would be improved with faster hardware - ex: running the test suite, running a build, booting and shutting down a virtual machine - and measure the time it takes with current hardware and with better hardware.
Then compute the monthly, or yearly cost: how many times per month x time gained x hourly salary, and see if this is enough to make a case.
For instance, suppose you made $10,000/month and gained 5 minutes a day with a better machine. The loss to your company per month would be around (5/60 hours lost a day) x 20 work days/month x ($10,000/month / (20 work days/month x 8 hours/day)) = about $104/month, or about $1,250/year lost because of the machine (assuming I didn't mess up the math...). Now before talking to your manager, think about whether this number is significant.
Now this is assuming that 1) you can measure the improvement, even though you don't have a better machine, and 2) while you are "wasting" your 5 minutes a day, you are not doing anything productive, which is not obvious.
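The arithmetic for the $10,000/month example can be written out as a small Python sketch. The 20 work days per month and 8 hours per day are the same assumptions used in the example; the function name is just for illustration.

```python
def monthly_cost_of_waiting(monthly_salary, minutes_lost_per_day,
                            work_days_per_month=20, hours_per_day=8):
    """Salary cost of the time lost to waiting, per month."""
    hourly_rate = monthly_salary / (work_days_per_month * hours_per_day)
    hours_lost = minutes_lost_per_day / 60 * work_days_per_month
    return hours_lost * hourly_rate


cost = monthly_cost_of_waiting(10_000, 5)
print(f"${cost:.2f} per month, ${cost * 12:.2f} per year")
```

Note that the result scales linearly with the minutes lost, so 30 minutes a day instead of 5 makes the same case six times stronger.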
For me, the cost of a slow machine is more psychological, but it's hard to quantify - after a few days of having to wait for things to happen on the PC, I begin to get cranky, which is both bad for my focus, and my co-workers!
It’s easy; hardware is cheap, developers are expensive. Throwing reasonable amounts of money at the machinery should be an absolute no brainer and if your management doesn’t understand that and won’t be guided by your professional opinion then you might be in the wrong job.
As for your machine, throw some more RAM at it and use a fast disk (have a look at how intensive VS is on disk IO using the resource monitor – it’s very hungry). Lots of people are moving to 10,000 RPM drives or even SSDs these days, and they make a big difference to your productivity.
Try this; take the price of the hardware you need (say fast disk and more RAM), split it across a six month period (a reasonable time period in which to recoup the investment) and see what it’s worth in “developer time” each day. You’ll probably find it only needs to return you a few minutes a day to pay for itself. Once again, if your management can’t understand or support this then question if you’re in the right place.
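A minimal sketch of that break-even calculation in Python. The hardware price and hourly developer cost are assumed figures, not numbers from this answer.

```python
def break_even_minutes_per_day(hardware_cost, hourly_rate,
                               months=6, work_days_per_month=20):
    """Minutes the hardware must save each working day to pay for itself."""
    cost_per_day = hardware_cost / (months * work_days_per_month)
    return cost_per_day / hourly_rate * 60


# Assumed: $600 for a fast disk plus RAM, and a developer costing $50/hour.
minutes = break_even_minutes_per_day(600, 50)
print(f"needs to save about {minutes:.1f} minutes per working day")
```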
I know the title of my question is rather vague, so I'll try to clarify as much as I can. Please feel free to moderate this question to make it more useful for the community.
Given a standard LAMP stack with more or less default settings (a bit of tuning is allowed, client-side and server-side caching turned on), running on modern hardware (16GB RAM, 8-core CPU, unlimited disk space, etc), deploying a reasonably complicated CMS service (a Drupal or Wordpress project for argument's sake) - what amounts of traffic, SQL queries, user requests can I reasonably expect to accommodate before I have to start thinking about performance?
NOTE: I know that specifics will greatly depend on the details of the project, i.e. optimizing MySQL queries, indexing stuff, minimizing filesystem hits - assuming web developers did a professional job - I'm really looking for a very rough figure in terms of visits per day, traffic during peak visiting times, how many records before (transactional) MySQL fumbles, so on.
I know the only way to really answer my question is to run load testing on a real project, and I'm concerned that my question may be treated as partly off-topic.
I would like to get a set of figures from people with first-hand experience, e.g. "we ran such and such set-up and it handled at least this much load [problems started surfacing after such and such]". I'm also greatly interested in any condensed (I'm short on time atm) reading I can do to get a better understanding of the matter.
P.S. I'm meeting a client tomorrow to talk about his project, and I want to be prepared to reason about performance if his project turns out to be akin to FourSquare.
Very tricky to answer without specifics as you have noted. If I was tasked with what you have to do, I would take each component in turn ( network interface, CPU/memory, physical IO load, SMP locking etc) and get the maximum capacity available, divide by rough estimate of use per request.
For example, network io. You might have 1x 1Gb card, which might achieve maybe 100Mbytes/sec. ( I tend to use 80% of theoretical max). How big will a typical 'hit' be? Perhaps 3kbytes average, for HTML, images etc. that means you can achieve 33k requests per second before you bottleneck at the physical level. These numbers are absolute maximums, depending on tools and skills you might not get anywhere near them, but nobody can exceed these maximums.
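As a rough illustration of this per-component estimate, here is a Python sketch. Only the network figures come from the example above; the disk and CPU numbers are made-up placeholders that show the shape of the calculation, not measurements of any real system.

```python
# Each entry: (usable capacity per second, estimated use per request).
components = {
    # From the example above: ~100 MB/s usable, ~3 KB per hit (in KB).
    "network": (100_000, 3),
    # Hypothetical placeholder: 5,000 disk I/O operations/s, 2 per hit.
    "disk_io": (5_000, 2),
    # Hypothetical placeholder: 8 cores x 1,000 ms of CPU per second,
    # 10 ms of CPU per hit.
    "cpu": (8_000, 10),
}

# Upper bound on requests per second for each component taken alone.
limits = {name: capacity / per_request
          for name, (capacity, per_request) in components.items()}

# The component with the lowest ceiling is the first likely bottleneck.
bottleneck = min(limits, key=limits.get)

for name, limit in sorted(limits.items(), key=lambda item: item[1]):
    print(f"{name:8s} ~{limit:,.0f} requests/s at most")
print(f"first bottleneck: {bottleneck}")
```

These are ceilings, not predictions: the point is to see which component runs out first under your own estimates.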
Repeat the above for every component, perhaps varying your numbers a little, and you will build a quick picture of what is likely to be a concern. Then, consider how you can quickly get more capacity in each component, can you just chuck $$ and gain more performance (eg use SSD drives instead of HD)? Or will you hit a limit that cannot be moved without rearchitecting? Also take into account what resources you have available, do you have lots of skilled programmer time, DBAs, or wads of cash? If you have lots of a resource, you can tend to reduce those constraints easier and quicker as you move along the experience curve.
Do not forget external components too, firewalls may have limits that are lower than expected for sustained traffic.
Sorry I cannot give you real numbers, our workloads are using custom servers, high memory caching and other tricks, and not using all the products you list. However, I would concentrate most on IO/SQL queries and possibly network IO, as these tend to be more hard limits, than CPU/memory, although I'm sure others will have a different opinion.
Obviously, the question is such that does not have a "proper" answer, but I'd like to close it and give some feedback. The client meeting has taken place, performance was indeed a biggie, their hosting platform turned out to be on the Amazon cloud :)
From research I've done independently:
Memcache is a must;
MySQL (or whatever persistent storage instance you're running) is usually the first to go. Solutions include running multiple virtual instances and replicate data between them, distributing the load;
http://highscalability.com/ is a good read :)
I am impressed with Visual Studio Profiler for performance analysis. Fast for my purposes and easy to use.
I am just curious to know about the caveats of the Visual Studio Profiler. Are there any profilers for Windows applications that fare better on these caveats?
On the positive side, nobody makes great apps like Microsoft. Visual Studio is a fine product, and its profiler shares those attributes.
On the other hand, there are caveats (shared by other profilers as well).
In sampling mode, it doesn't sample when the thread is blocked. Therefore it is blind to extraneous I/O, socket calls, etc. This is an attribute that dates from the early days of prof and gprof, which started out as PC samplers, and since when blocked the PC is meaningless, sampling was turned off. The PC may be meaningless, but the stack tells exactly why the thread is blocked and, when there is much time going into that, you need to know it.
In instrumentation mode, it can include I/O, but it only gives you function-level percent of time, not line level. That may be OK if functions happen to be small, or if they only call each other in a small number of places, so finding call sites is not too hard. I work with good programmers, but our code is not all like that. In fact, often the call sites are invisible, because they are compiler-inserted. On the other hand, stack samples pinpoint those calls no matter who wrote them.
The profiler does a nice job of showing you the split between activity of different threads. Then what you need to know is, if a thread is suspended or showing a low processor activity, is that because it is blocking for something that it doesn't really have to? Stack samples could tell you that if they could be taken during blocking. On the other hand, if a thread is cranking heavily, do you know if what it is doing is actually necessary or could be reduced? Stack samples will tell you that also.
Many people think the primary job of a profiler is to measure. Personally, I want something that pinpoints code that costs a lot of time and can be done more efficiently. Most of the time these are function call sites, not "hot spots". I don't need to know "a lot of time" with any precision. If I know it is, say, 60% +/- 20% that's perfectly fine with me because I'm looking for the problem, not the measurement. If because of this imprecision, I fix a problem which is not the largest, that's OK, because when I repeat the process, the largest problem will be even bigger, as a percent, so I won't miss it.
( Please provide the question this one duplicates. I'm disappointed I couldn't find it. )
My development machine is "slow". I wait on it "a lot".
I've been asked by decision makers who want to help to fairly and accurately measure that time. How do you quantify the amount of time you spend waiting on the computer (during compiles, waiting for apps to open every day, etc).
Is there software which effectively reports on this sort of thing? Is there an OS metric (I/O something something, pagefile swapping frequency, etc, etc) that captures and communicates this particularly well? Some sort of benchmark you'd recommend me testing against?
EDIT: I'm writing C# (mostly ASP.NET).
Here's one metric that may impress some higher ups: measure the average time it takes to build your application, and how many times you do that per day. For instance, we ended up with ~100 builds a day @ 60 secs each.
Now, measure the average build time on a presumably faster machine (say 30 secs per build).
At this point you can see how much time it would save you to have the 'faster' machine. Per-developer, per-day. Multiply by the number of developers, and the days in a month and you can see how this stacks against adding another developer to the team.
Yes, I know, there are other considerations when adding more people to a team, but this will give you a rough comparison that 'higher ups' can relate to. For instance: if we all had faster machines, we would spend less time on the builds, comparable to one extra developer.
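Here is that comparison as a small Python sketch, using the ~100 builds a day at 60 secs versus 30 secs from above. The team size of 10 and the 8-hour working day are assumptions for illustration.

```python
def team_hours_saved_per_day(builds_per_day, old_secs, new_secs, developers):
    """Hours the whole team gets back each day from faster builds."""
    return builds_per_day * (old_secs - new_secs) * developers / 3600


# 100 builds/day, 60 s -> 30 s per build; a team of 10 is an assumption.
saved = team_hours_saved_per_day(100, 60, 30, developers=10)

# Express the saving as "extra developers", assuming an 8-hour day.
extra_developers = saved / 8

print(f"{saved:.1f} team hours saved per day, "
      f"roughly {extra_developers:.2f} extra developers")
```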
On the other side, you should provide good estimates of the cost of upgrading everyone's machine.
Now, if you can, you should run this type of comparison against multiple 'faster' machines to determine their relative performance and perhaps identify which bottlenecks you are facing (RAM vs CPU vs I/O ?).
Finally, my personal take is that, while this sort of process and the following discussion with the stakeholders takes place (and it may take a while), you could get everyone bigger/more monitors. That's a relatively cheap upgrade (of course, not that cheap if you go for 52" LCD monitors, right?) and more monitor-estate does improve productivity (protip: also improves employees morale, which, in turn, improves productivity).
HTH
Close FireFox to gain some memory. Add RAM. Helped me a lot.
Depends on your work environment. E.g. in Visual Studio (C++, 2005) you can do timed builds, such that the IDE prints the elapsed time after the regular build output.
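If your environment has no built-in build timer, you can time the build from the outside. This is a minimal Python sketch: the command below is a stand-in that just starts and exits a Python interpreter so the script runs anywhere, and you would replace it with your real build command line.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd several times and return the wall-clock duration of each run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raises if the command fails
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in command (an assumption): replace with your actual build command.
build_cmd = [sys.executable, "-c", "pass"]

timings = time_command(build_cmd)
print(f"mean {statistics.mean(timings):.2f}s, "
      f"max {max(timings):.2f}s over {len(timings)} runs")
```

Running the same script on the current machine and on a candidate replacement gives you directly comparable numbers.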
Quantification is difficult when you don't have anything to measure/compare against. If your dev-box takes 12 minutes to compile a project of 100,000 lines of code without any other dev-box to measure against you have no idea if this is good or bad. Maybe 12 minutes for 100,000 lines is actually good?
Measuring it won't help you and it certainly won't help your decision makers. Consider; "Yes boss, it takes an average of twelve minutes to compile our project." The boss says; "Ok, is that normal?". You have no idea.
Computer hardware is cheap. Look at the dev-box and consider asking the decision makers to throw some cash at it to improve its performance. If you compile on average 5 times a day and it takes an average of 12 minutes a compile, that's a lost hour every single day - adding up to 5 lost hours a week. Well worth the cost of some RAM or a CPU upgrade.
To me, a slow machine does not kill productivity as much as unexpected slowing down does - if the machine compiles the whole solution in 12 minutes every time you press F5, the solution has some problem, not the machine. Besides that, I don't have a problem with 12 minutes; I can get up and take a break. It's actually good to take a break when you know and have control over how long the break will be.
What I found to be the biggest productivity killers are those corporate software packages that start scanning for viruses (or installing updates) whenever they like - having to sit there and wait is a pain in the ass.
All too often I read statements about some new framework and their "benchmarks." My question is a general one but to the specific points of:
What approach should a developer take to effectively instrument code to measure performance?
When reading about benchmarks and performance testing, what are some red-flags to watch out for that might not represent real results?
There are two methods of measuring performance: using code instrumentation and using sampling.
The commercial profilers (Hi-Prof, Rational Quantify, AQTime) I used in the past used code instrumentation (some of them could also use sampling) and in my experience, this gives the best, most detailed result. Especially Rational Quantify allows you to zoom in on results, focus on sub trees, remove complete call trees to simulate an improvement, ...
The downside of these instrumenting profilers is that they:
tend to be slow (your code runs about 10 times slower)
take quite some time to instrument your application
don't always correctly handle exceptions in the application (in C++)
can be hard to set up if you have to disable the instrumentation of DLL's (we had to disable instrumentation for Oracle DLL's)
The instrumentation also sometimes skews the times reported for low-level functions like memory allocations, critical sections, ...
The free profilers (Very Sleepy, Luke Stackwalker) that I use use sampling, which means that it is much easier to do a quick performance test and see where the problem lies. These free profilers don't have the full functionality of the commercial profilers (although I submitted the "focus on subtree" functionality for Very Sleepy myself), but since they are fast, they can be very useful.
At this time, my personal favorite is Very Sleepy, with Luke StackWalker coming second.
In both cases (instrumenting and sampling), my experience is that:
It is very difficult to compare the results of profilers over different releases of your application. If you have a performance problem in your release 2.0, profile your release 2.0 and try to improve it, rather than looking for the exact reason why 2.0 is slower than 1.0.
You must never compare the profiling results with the timing (real time, cpu time) results of an application that is run outside the profiler. If your application consumes 5 seconds CPU time outside the profiler, and when run in the profiler the profiler reports that it consumes 10 seconds, there's nothing wrong. Don't think that your application actually takes 10 seconds.
That's why you must consistently check results in the same environment. Consistently compare results of your application when run outside the profiler, or when run inside the profiler. Don't mix the results.
Also use a consistent environment and system. If you get a faster PC, your application could still run slower, e.g. because the screen is larger and more needs to be updated on screen. If moving to a new PC, retest the last (one or two) releases of your application on the new PC so you get an idea on how times scale to the new PC.
This also means: use fixed data sets and check your improvements on these datasets. It could be that an improvement in your application improves the performance of dataset X, but makes it slower with dataset Y. In some cases this may be acceptable.
Discuss with the testing team what results you want to obtain beforehand (see Oded's answer on my own question What's the best way to 'indicate/numerate' performance of an application?).
Realize that a faster application can still use more CPU time than a slower application, if the faster one uses multi-threading and the slower one doesn't. Discuss (as said before) with the testing team what needs to be measured and what doesn't (in the multi-threading case: real time instead of CPU time).
Realize that many small improvements may lead to one big improvement. If you find 10 parts in your application that each take 3% of the time and you can reduce it to 1%, your application will be 20% faster.
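The arithmetic behind that last point, as a short Python sketch. It assumes the 10 parts do not overlap, so their savings simply add up; the function name is only for illustration.

```python
def remaining_time_fraction(parts):
    """Fraction of the original run time left after the improvements.

    parts is a list of (old_fraction, new_fraction) pairs, each expressed
    as a fraction of the original total run time.
    """
    saved = sum(old - new for old, new in parts)
    return 1.0 - saved


# 10 parts that each drop from 3% to 1% of the total run time.
remaining = remaining_time_fraction([(0.03, 0.01)] * 10)
speedup = 1 / remaining

print(f"run time drops to {remaining:.0%} of the original "
      f"({speedup:.2f}x speedup)")
```

So "20% faster" here means the run takes 20% less time, which works out to a 1.25x speedup.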
It depends what you're trying to do.
1) If you want to maintain general timing information, so you can be alert to regressions, various instrumenting profilers are the way to go. Make sure they measure all kinds of time, not just CPU time.
2) If you want to find ways to make the software faster, that is a distinctly different problem.
You should put the emphasis on the find, not on the measure.
For this, you need something that samples the call stack, not just the program counter (over multiple threads, if necessary). That rules out profilers like gprof.
Importantly, it should sample on wall-clock time, not CPU time, because you are every bit as likely to lose time due to I/O as due to crunching. This rules out some profilers.
It should be able to take samples only when you care, such as not when waiting for user input. This also rules out some profilers.
Finally, and very important, is the summary you get.
It is essential to get per-line percent of time.
The percent of time used by a line is the percent of stack samples containing the line.
Don't settle for function-only timings, even with a call graph.
This rules out still more profilers.
(Forget about "self time", and forget about invocation counts. Those are seldom useful and often misleading.)
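To make the per-line figure concrete, here is a minimal Python sketch of the calculation described above: the percent for a line is the percent of stack samples that contain it. The samples are invented for illustration; in practice they would come from pausing the program or from a sampling profiler.

```python
from collections import Counter


def line_percentages(samples):
    """Percent of stack samples that contain each line."""
    counts = Counter()
    for stack in samples:
        # Count a line once per sample, even if recursion repeats it.
        counts.update(set(stack))
    return {line: 100.0 * n / len(samples) for line, n in counts.items()}


# Hypothetical samples: each one is a call stack of "file:line" entries.
samples = [
    ["main.c:10", "report.c:42", "db.c:7"],
    ["main.c:10", "report.c:42", "db.c:7"],
    ["main.c:10", "report.c:42", "fmt.c:3"],
    ["main.c:10", "ui.c:88"],
    ["main.c:12"],
]

pct = line_percentages(samples)
for line, percent in sorted(pct.items(), key=lambda item: -item[1]):
    print(f"{line:12s} {percent:5.1f}%")
```

In this made-up data the call at report.c:42 is on the stack in 3 of 5 samples, so removing or speeding up that call could save up to about 60% of the time, which is the kind of finding the answer is after.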
Accuracy of finding the problems is what you're after, not accuracy of measuring them. That is a very important point. (You don't need a large number of samples, though it does no harm. The harm is in your head, making you think about measuring, rather than what is it doing.)
One good tool for this is RotateRight's Zoom profiler. Personally I rely on manual sampling.
I always wondered what different methods Google Desktop Search uses so that it consumes so little CPU and memory while indexing a computer containing more than 100,000 files on average.
In just few hours it has indexed the whole system and I did not see it eating up my CPU, memory etc.
If any of you have done some research, please do share.
The trick is simple: It starts to work then very soon stops and just sits there in memory, doing nothing. Of course it's then totally useless but at least it stays light and fast. Sorry, couldn't resist :-) I switched to Windows Search 4.0 and I'm much happier about it.
It doesn't...
I installed it on one computer, and quickly removed it because it was intrusive (although this can be probably configured) and hungry (particularly on a low end PC).
It is installed on a laptop near me right now, and if I compare it to a couple of small utilities I run permanently (SlickRun, CLCL, my AutoHotkey script...) it uses more than 10 times their CPU and 5 to 20 times their memory. Times two, since, for some reason, I have one instance running another, plus the ToolbarNotifier (less hungry).
Even Trend Micro anti-virus uses less memory and CPU.
Perhaps I will try it again when I will get a more modern PC with lot of memory, but right now I am happy enough with some grep utilities, even if they are slower.
Take a look at disk usage. If you build many keys/indexes you will use lots of disk space and the searches will be fast.
For example;
30 gig drive 75% used. 3.6 gig used for 2 instances of Google Desktop. (roaming profiles suck)
Once it has done the initial index, and written it to disc, it doesn't need to do anything.
Searching using the index will require very little resources, the only thing that will is indexing new or modified files..