Why is my calculation taking so long on the same HPC cluster at my university?

I am using VASP to run an optimization calculation on a three-atom system. I used to be able to complete these calculations in just a few minutes, but now they're taking over 8 hours. I've tried using the same files in a different folder without changing any parameters, but I'm still getting the same results. The administrator has no explanation for this issue. Has anyone else experienced this problem, or have any suggestions for what might be causing it?
I hope someone can answer my query.

Related

testing runtimes of different versions of the same code

I've been wondering about the time complexity of the programs that I run, and I would like to know how I can compare their speeds. I usually do problems on platforms like CodeAbbey, and when I'm done I get the chance to see other people's attempts at the problem as well. Given that diversity, people use many different methods, and my hope is to pick up tricks from what they have done to improve my skills. So is there some platform/website/app/program etc. that I can use to check which finishes first by running them side by side?
One other curious question I've been wondering about is:
does each program take exactly the same time to run every time? Like, if I code something up and run it once, then run it a second time, will it execute in the same time as before, or will it be slightly different, so that the best I can hope for is an average?
I have tried to Google it, but I don't really understand the results or how to use the sites. I just want to be able to see how much time it takes to run.
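On the second question: no, run times are never exactly identical, because the OS scheduler, CPU caches, and background load all vary between runs; the best you can hope for is indeed a distribution, which is why benchmarks report an average or a minimum over many runs. A minimal sketch in Python (the two candidate solutions are invented placeholders) that times each one several times and shows the spread:

    import statistics
    import time

    def candidate_a(n):
        # placeholder solution: sum with an explicit loop
        total = 0
        for i in range(n):
            total += i
        return total

    def candidate_b(n):
        # placeholder solution: closed-form sum
        return n * (n - 1) // 2

    def time_runs(func, n, repeats=5):
        # time several runs of func(n); the spread is the run-to-run jitter
        runs = []
        for _ in range(repeats):
            start = time.perf_counter()
            func(n)
            runs.append(time.perf_counter() - start)
        return runs

    for func in (candidate_a, candidate_b):
        runs = time_runs(func, 1_000_000)
        print(f"{func.__name__}: best {min(runs):.6f}s  "
              f"mean {statistics.mean(runs):.6f}s  "
              f"spread {max(runs) - min(runs):.6f}s")

The minimum of several runs is usually the most stable number for comparing two solutions, since it is the measurement least polluted by other activity on the machine.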

How To Identify Inefficient Code Within a Function

I was recently asked this question during an interview, and apart from console.log and the debugger, I wasn't able to name more tools or options for this problem.
The question was: I am reviewing code and I find that the code is causing performance issues. The code is one very lengthy function. How would I go about identifying the line(s) of code causing the performance issues?
Thinking about it now, after the interview, the only other solution that comes to mind is to break the code into smaller functions and analyse each one. However, I wonder what the best solution to this problem is, not only for interviews; it would help me be more mindful of the options available when I encounter this problem in real life. (Unfortunately, the interviewer was quite non-communicative and clearly wanted to go through the questions as quickly as possible.)
Thanks
One easy way to do this is to store the current time right before the suspect function call, then, inside that function, store the current time before and after the strategic places where you think the culprit is. At the end, print the time it took to execute the parts between those strategic places, and you will have a good idea of where the system is spending most of its time.
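A minimal sketch of that technique (the sections are invented; in the JavaScript setting of the interview, console.time()/console.timeEnd() give you the same thing). Here in Python:

    import time

    def lengthy_function(data):
        t0 = time.perf_counter()

        cleaned = [x.strip().lower() for x in data]   # suspected section 1
        t1 = time.perf_counter()

        unique = sorted(set(cleaned))                 # suspected section 2
        t2 = time.perf_counter()

        print(f"cleaning: {t1 - t0:.4f}s  dedupe/sort: {t2 - t1:.4f}s")
        return unique

    lengthy_function(["  Foo", "bar ", "foo"] * 100_000)

Whichever delta dominates tells you where to look next; you then move the timestamps inward until the culprit is only a few lines wide.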

Severe GAE performance degradation all of sudden

We've seen calls start to take 10x to 100x longer as of a specific point in time (right about 2012-02-18 04:00:00). For example, our longest call, which normally takes up to 5 seconds, is now timing out at a minute. I'm pretty sure it's not the result of a code or configuration change on our part. All of our API calls seem to be impacted. Using AppStats, there seems to be no glaring culprit. (Unfortunately, I don't have a saved graph from before Friday to compare to.) We do many gets/fetches, and some writing, against the high-replication datastore; both seem impacted.
Any insight?
Best to file a production issue so it can be investigated.

Successful performance update to web app, but don't know why. How to find out?

This is kind of a strange title, so let me explain:
We have a web application (PHP, Zend Framework) that is quite successful. Over time traffic grew and performance degraded (from tens of requests averaging 80 ms to tens of thousands of requests averaging >600 ms). We didn't expect so much traffic when first designing the application, so no big surprise. We decided to look into many things that could improve the performance.
A few days into the effort, a production bug appeared that needed to be fixed. As the first changes we had made to clean up some queries and caching code were already done and tested, we figured we could just add them to the update. None of the changes had really improved performance much in local testing and staging, but anyway.
But yeah, it did on production. Our graphs plunged to almost zero, and at first we were devastated, thinking the update had somehow made all the traffic disappear. But as we looked closer, the graphs were back at 80 ms and almost invisible next to the 600 ms mountains ;)
So we completely fixed the performance problems with changes we didn't even think would make a difference. Total success, but of course we want to understand which of these changes made the difference.
How would you tackle this problem?
Some background:
PHP application using Zend Framework, MySQL as database, Memcache for caching.
We get our performance graphs and insight into the application from NewRelic.com, but I can't really find the reason for the better performance there.
Using JMeter we could reproduce the bad performance on our dev servers, and also, more or less, the better performance of the updated version.
The only idea I have right now is to start with the old version, load-test it, add one commit, load-test it, add another commit, load-test it... but this doesn't sound like much fun, or very effective.
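For what it's worth, git bisect run automates exactly that loop: it binary-searches the commit range, running a script that must exit 0 when a checkout is fast and non-zero when it is slow. A sketch of such a probe (the load-test wrapper and the threshold are assumptions, not anything from the original setup):

    #!/usr/bin/env python3
    # Probe script for use with: git bisect run ./perf_probe.py
    # Exits 0 if this checkout is "fast", 1 if it is "slow".
    import subprocess
    import sys
    import time

    THRESHOLD_MS = 300  # assumed cutoff between the 80 ms and 600 ms regimes

    start = time.perf_counter()
    subprocess.run(["./run_load_test.sh"], check=True)  # hypothetical JMeter wrapper
    elapsed_ms = (time.perf_counter() - start) * 1000

    sys.exit(0 if elapsed_ms < THRESHOLD_MS else 1)

With git bisect start, git bisect bad HEAD, git bisect good <last-known-fast-commit>, and then git bisect run ./perf_probe.py, git does the tedious part in a logarithmic number of load tests.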
Update: We found the reason for the performance problems; I will add an answer later to explain what we did and what the reason was. (Or how are updates and solutions handled for such questions?)
Update 2: Will add solution and way to find it as answer.
I think the easiest way would be to use XDebug or Zend Studio to debug your application.
Running it through the profiler will show you a breakdown of the execution flow, and all methods called, how long they took, and how much memory you used. The profiler should reveal if some block of code is called many times, or if there is something that simply takes a long time to execute sometimes.
If you do see 20ish millisecond responses from the profiler, then I would run a load tester in the background while I profiled on a different machine to see if heavy load seems to explain some of the time increases, and if so, what exactly is taking longer.
To me, that is the easiest way to see what is taking so long, rather than loading different versions of the code and seeing how long they take. Done that way, you at least know which branch had the speed problem, but you are still left to hunt down why it is slow, as it may not be as simple as one piece of code being changed or optimized. It could be a combination of things.
I use Zend Studio for profiling and it is a huge time saver with that feature. XDebug's profiler is very similar AFAIK.
Docs:
http://files.zend.com/help/Zend-Studio/profiling.htm
http://xdebug.org/docs/profiler
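For reference, turning on XDebug's profiler is mostly a php.ini change; a sketch of the XDebug 2-era settings (the option names changed in XDebug 3, roughly to xdebug.mode=profile and xdebug.output_dir, so check the docs above for your version):

    ; php.ini
    zend_extension = xdebug.so
    xdebug.profiler_enable_trigger = 1   ; profile only requests that carry XDEBUG_PROFILE
    xdebug.profiler_output_dir = /tmp    ; cachegrind.out.* files are written here

The resulting cachegrind files can be opened in KCachegrind or Webgrind to see the per-function time breakdown.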
Ideally you need to profile the old version of the app and the new version of the app with the same realistic data, but I somehow doubt you're going to have the time or inclination to do that.
What you could do is start by comparing the efficiency of the DB queries you've re-written against the previous versions, also look at how often they're called etc., and what effect the caching you've introduced has on that.
What I would also do is change the process going forward so that you introduce change as a flow (continuous integration/deployment style) so that you can see the impact of individual changes more clearly.
So what was the problem? Two extra ' characters in a MySQL query. The numeric value going into the method was accidentally a string, so the ORM put ' around it. Normally these problems are caught by the optimizer, but in this case it was a quite complicated combination of JOINs; perhaps that's why it was missed. Because this was also the most-used query, every execution of it was a tiny bit slower - and that made all the difference in the end.
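A toy sketch of the failure mode (the helper is invented, but this is roughly what an ORM does when binding values):

    def render_condition(column, value):
        # mimic an ORM rendering a WHERE clause from a dynamically typed value
        if isinstance(value, str):
            return f"WHERE {column} = '{value}'"  # the two extra quotes
        return f"WHERE {column} = {value}"

    print(render_condition("user_id", 42))    # WHERE user_id = 42
    print(render_condition("user_id", "42"))  # WHERE user_id = '42'

In simple queries MySQL's optimizer converts the quoted constant and still uses the index, which is why the problem can hide for so long; comparing EXPLAIN output for the two variants on the real JOIN is the quickest way to see whether the index is actually being used.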
When you simply cannot optimize and scale locally any more, take a look here:
http://www.zend.com/en/products/php-cloud/

When is the optimization really worth the time spent on it?

After my last question, I want to understand when an optimization is really worth the time a developer spends on it.
Is it worth spending 4 hours to have queries that are 20% quicker? Yes, no, maybe, yes if...?
Is 'wasting' 7 hours to switch a task to another language to save about 40% of the CPU usage 'worth it'?
My normal iteration for a new project is:
Understand what the customer wants, and what he needs;
Plan the project: what languages and where, database design;
Develop the project;
Test and bug-fix;
Final analysis of the running project and final optimizations;
If the project requires it, a further analysis on the real-life usage of the resources followed by further optimization;
"Write good, maintainable code" is implied.
Obviously, the big 'optimization' part happens at point #2, but often when reviewing the code after the project is over I find some sections that, even if they do their job well, could be improved. This is the rationale for point #5.
To give a concrete example of the last point: say I expect 90% of queries on a table to be SELECTs and 10% to be INSERT/UPDATEs, so I load the table up with indexes. But after 6 months I see that in real life 10% of the queries are SELECTs and 90% are INSERT/UPDATEs, so query speed is not what was optimized. This is the first example that comes to my mind (and obviously this is more a 'patch' for an initial mis-design than an optimization ;).
Please note that I'm a developer, not a business-man - but I like to have a clear conscience by giving my clients the best, where possible.
I mean, I know that if I spend 50 hours to gain 5% of an application's total speed-up, and the application is used by 10 users, then it maybe isn't worth the time... but what about when it is?
When do you think that an optimization is crucial?
What formula do you usually apply, aware that the time spent optimizing (and the final gain) is not always quantifiable on paper?
EDIT: Sorry, but I can't accept an answer like 'until people complain about it, optimization is not needed'; that may be a business view (questionable, IMHO), but it is not a developer's answer or (IMHO too) a common-sense one. I know this question is really subjective.
I agree with Cheeso that performance optimization should be deferred until after some analysis of the real-life usage and load of the project, but a small and quick optimization can be done immediately after the project is over.
Thanks to all ;)
YAGNI. Unless people complain, a lot.
EDIT: I built a library that was slightly slower than the alternatives out there. It was still gaining usage and share because it was nicer to use and more powerful. I continued to invest in features and capability, deferring any work on performance.
At some point, there were enough features that performance bubbled to the top of the list, and I finally spent some time working on perf improvement, but only after considering the effort for a long time.
I think this is the right way to approach it.
There are (at least) two categories of "efficiency" to mention here:
UI applications (and their dependencies), where the most important measure is the response time to the user.
Batch processing, where the main indicator is total running time.
In the first case, there are well-documented rules about response times. If you care about product quality, you need to keep response times short. The shorter the better, of course, but the breaking points are about:
100 ms for an "immediate" response; animation and other "real-time" activities need to happen at least this fast;
1 second for an "uninterrupted" response. Any more than this and users will get frustrated; you also need to start thinking about showing a progress screen past this point.
10 seconds for retaining user focus. Any worse than this and your users will be pissed off.
If you're finding that several operations are taking more than 10 seconds, and you can fix the performance problems with a sane amount of effort (I don't think there's a hard limit but personally I'd say definitely anything under 1 man-month and probably anything under 3-4 months), then you should definitely put the effort into fixing it.
Similarly, if you find yourself creeping past that 1-second threshold, you should be trying very hard to make it faster. At a minimum, compare the time it would take to improve the performance of your app with the time it would take to redo every slow screen with progress dialogs and background threads that the user can cancel - because it is your responsibility as a designer to provide that if the app is too slow.
But don't make a decision purely on that basis - the user experience matters too. If it'll take you 1 week to stick in some async progress dialogs and 3 weeks to get the running times under 1 second, I would still go with the latter. IMO, anything under a man-month is justifiable if the problem is application-wide; if it's just one report that's run relatively infrequently, I'd probably let it go.
If your application is real-time - graphics-related for example - then I would classify it the same way as the 10-second mark for non-realtime apps. That is, you need to make every effort possible to speed it up. Flickering is unacceptable in a game or in an image editor. Stutters and glitches are unacceptable in audio processing. Even for something as basic as text input, a 500 ms delay between the key being pressed and the character appearing is completely unacceptable unless you're connected via remote desktop or something. No amount of effort is too much for fixing these kinds of problems.
Now for the second case, which I think is mostly self-evident. If you're doing batch processing then you generally have a scalability concern. As long as the batch is able to run in the time allotted, you don't need to improve it. But if your data is growing, if the batch is supposed to run overnight and you start to see it creeping into the wee hours of the morning and interrupting people's work at 9:15 AM, then clearly you need to work on performance.
Actually, you really can't wait that long; once it fails to complete in the required time, you may already be in big trouble. You have to actively monitor the situation and maintain some sort of safety margin - say a maximum running time of 5 hours out of the available 6 before you start to worry.
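A minimal sketch of that kind of margin check, using the window and margin from the example above (the job and the alert hook are stand-ins):

    import time

    WINDOW_HOURS = 6.0
    SAFETY_HOURS = 5.0  # start worrying here, well before the hard limit

    def run_nightly_batch():
        time.sleep(0.1)  # stand-in for the real job

    def alert(message):
        print(f"ALERT: {message}")  # stand-in for paging/email

    start = time.perf_counter()
    run_nightly_batch()
    elapsed_hours = (time.perf_counter() - start) / 3600

    if elapsed_hours > WINDOW_HOURS:
        alert("batch blew its 6-hour window")
    elif elapsed_hours > SAFETY_HOURS:
        alert(f"batch took {elapsed_hours:.1f} h; the safety margin is gone")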
So the answer for batch processes is obvious. You have a hard requirement that the batch must finish within a certain time. Therefore, if you are getting close to the edge, performance must be improved, regardless of how difficult/costly it is. The question then becomes what the most economical means of improving the process is.
If it costs significantly less to just throw some more hardware at the problem (and you know for a fact that the problem really does scale with hardware), then don't spend any time optimizing, just buy new hardware. Otherwise, figure out what combination of design optimization and hardware upgrades is going to get you the best ROI. It's almost purely a cost decision at this point.
That's about all I have to say on the subject. Shame on the people who respond to this with "YAGNI". It's your professional responsibility to know or at least find out whether or not you "need it." Assuming that anything is acceptable until customers complain is an abdication of this responsibility.
Simply because your customers don't demand it doesn't mean you don't need to consider it. Your customers don't demand unit tests, either, or even reasonably good/maintainable code, but you provide those things anyway because it is part of your profession. And at the end of the day, your customers will be a lot happier with a smooth, fast product than with any of those other developer-centric things.
Optimization is worth it when it is necessary.
If we have promised the client response times on holiday package searches that are 5 seconds or less and that the system will run on a single Oracle server (of whatever spec) and the searches are taking 30 seconds at peak load, then the optimization is definitely worth it because we're not going to get paid otherwise.
When you are initially developing a system, you (if you are a good developer) are designing things to be efficient without wasting time on premature optimization. If the resulting system isn't fast enough, you optimize. But your question seems to be suggesting that there's some hand-wavey additional optimization that you might do if you feel that it's worth it. That's not a good way to think about it because it implies that you haven't got a firm target in mind for what is acceptable to begin with. You need to discuss it with the stakeholders and set some kind of target before you start worrying about what kind of optimizations you need to do.
As everyone said in the other answers: when it makes monetary sense to change something, then it needs changing. In most cases, good enough wins the day. If the customers aren't complaining, then it is good enough. If they are complaining, then fix it enough that they stop complaining. Agile methodologies will give you some guidance on how to know when enough is enough. Who cares if something is using 40% more CPU than you think it needs? If it is working and the customers are happy, then it is good enough. Really simple: get it working and maintainable, and then wait for complaints that probably will never come.
If what you are worried about were really a problem, NO ONE would ever have started using Java to build mission-critical server-side applications. Or Python or Erlang or anything else that isn't C, for that matter. And if they hadn't, nothing would get done in a time frame to even acquire that first customer you are so worried about losing. You will know that you need to change something well before it becomes a problem.
Good posting everyone.
Have you looked at unnecessary use of transactions for simple SELECTs? I got burned by that one a few times... I also did some code cleanup and found MANY object graphs being returned when maybe 10 records were needed... on and on... sometimes it's not YOUR code per se, but someone cutting corners... Good luck!
If the client doesn't see a need to do performance optimization, then there's no reason to do it.
Defining a measurable performance SLA with the client (e.g., 95% of queries complete in under 2 seconds) near the beginning of the project lets you know whether you're meeting that goal, or whether you have more optimization to do. Performance testing at current and estimated future loads gives you the data you need to see whether you're meeting the SLA.
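Checking such an SLA is then mechanical once you collect latency samples; a sketch with invented numbers:

    import statistics

    SLA_SECONDS = 2.0
    latencies = [0.4, 1.1, 0.7, 2.5, 0.9, 1.8, 0.6, 3.1, 0.8, 1.2]  # invented samples

    # quantiles(n=20) returns 19 cut points; index 18 is the 95th percentile
    p95 = statistics.quantiles(latencies, n=20)[18]
    print(f"p95 = {p95:.2f} s -> SLA {'met' if p95 <= SLA_SECONDS else 'missed'}")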
Optimization is rarely worth it before you know what needs to be optimized. Always remember that if I/O is basically idle and CPU is low, you're not getting anything out of the computer. Obviously you don't want the CPU pegged all the time and you don't want to be running out of I/O bandwidth, but realize that trying to have the computer basically idle all day while it performs intense operations is unrealistic.
Wait until you start to reach a predefined threshold (80% utilization is the mark I usually use, others think that's too high/low) and then optimize at that point if necessary. Keep in mind that the best solution may be scaling up or scaling out and not actually optimizing the software.
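A sketch of that kind of threshold check using only the standard library (Unix-only, and 80% is just the mark mentioned above):

    import os

    THRESHOLD = 0.80  # the 80% utilization mark discussed above

    cores = os.cpu_count() or 1
    load1, load5, load15 = os.getloadavg()   # Unix only
    utilization = load5 / cores              # 5-minute load average, per core

    if utilization > THRESHOLD:
        print(f"load at {utilization:.0%} of capacity: optimize, scale up, or scale out")
    else:
        print(f"load at {utilization:.0%} of capacity: leave the code alone")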
Use Amdahl's law. It shows you the overall improvement you get from optimizing a certain part of a system.
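For reference: if a fraction p of the total runtime is sped up by a factor s, Amdahl's law puts the overall speedup at 1 / ((1 - p) + p / s). As a small function:

    def amdahl_speedup(p, s):
        # overall speedup when fraction p of runtime is accelerated by factor s
        return 1.0 / ((1.0 - p) + p / s)

    print(amdahl_speedup(0.20, 10))  # ~1.22x: making 20% of the work 10x faster barely helps
    print(amdahl_speedup(0.80, 2))   # ~1.67x: a modest 2x on 80% of the work pays off more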
Also: If it ain't broke, don't fix it.
Optimization is worth the time spent on it when you get good speedups for spending little time in optimizing (obviously). To get that, you need tools/techniques that lead you very quickly to the code whose optimization yields the most benefit.
It is common to think that the way to find that code is by measuring the time spent in each function, but to my mind that only provides clues - you still have to play detective. What takes me straight to the code is stackshots: pausing the program at random moments and looking at the full call stack each time, because whatever is responsible for the slowness shows up on a large fraction of those stacks. Here is an example of a 40-times speedup, achieved by finding and fixing several problems. Others on SO have reported speedup factors from 7 to 60, achieved with little effort.*
*(7x: Comment 1. 60x: Comment 30.)
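A toy sketch of the stackshot idea (in real use you would simply pause the process in a debugger a few times; every name here is invented): sample the worker's stack while it runs and count where it keeps being caught.

    import collections
    import sys
    import threading
    import time
    import traceback

    def slow_workload():
        total = 0
        for _ in range(30):
            total += sum(j * j for j in range(200_000))  # the hot spot

    def stackshots(target, shots=20, interval=0.01):
        # sample target's stack while it runs; hot code dominates the samples
        counts = collections.Counter()
        worker = threading.Thread(target=target)
        worker.start()
        while worker.is_alive() and shots > 0:
            frame = sys._current_frames().get(worker.ident)
            if frame is not None:
                counts[traceback.format_stack(frame)[-1]] += 1
                shots -= 1
            time.sleep(interval)
        worker.join()
        return counts

    for line, n in stackshots(slow_workload).most_common(3):
        print(f"{n} samples at:\n{line}")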
