I am looking for software that lets me regulate the CPU load while rendering in Blender on a MacBook Pro.
On some renders the CPU runs at 95% to 99%.
The temps are in the high 90s.
I don't care if I have to wait a couple of minutes longer.
Or am I worrying about nothing and this is "normal" behavior?
Thanks!
There are a couple of ways you could handle this.
Use an application such as ATMonitor, which allows you to renice applications/processes to your liking. A niceness of −20 is the highest priority and 19 or 20 is the lowest.
If you're more worried about the temperature of your CPU, you could also check out smcFanControl, which can lower CPU temps significantly by adjusting your fans' RPM.
I recommend both apps, and the best part is that they are free.
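To make the renicing suggestion above concrete, here is a minimal C sketch (assuming a POSIX system such as macOS) that lowers a process's priority with setpriority(). The PID and niceness value are placeholders, not taken from the question; note that only root may raise a priority (set a negative niceness).

    /* Minimal sketch (assumed POSIX/macOS): lower the priority of a render
     * process so the rest of the system stays responsive. The PID 12345 is
     * a placeholder - look up the real one first (e.g. in Activity Monitor
     * or with pgrep). */
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <sys/types.h>
    #include <sys/resource.h>

    int main(void)
    {
        pid_t pid = 12345;   /* hypothetical Blender process ID */
        int niceness = 10;   /* -20 = highest priority, 19/20 = lowest */

        if (setpriority(PRIO_PROCESS, (id_t)pid, niceness) != 0) {
            fprintf(stderr, "setpriority failed: %s\n", strerror(errno));
            return 1;
        }
        printf("Process %d reniced to %d\n", (int)pid, niceness);
        return 0;
    }

ATMonitor does the same thing through a GUI, so the sketch is only to show what the niceness numbers mean.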
I have a kthread which runs alone on one core of a multi-core CPU. This kthread disables all IRQs for that core, runs a loop as fast as possible, and measures the maximum loop duration with the help of the TSC. All the ACPI stuff is disabled (no frequency scaling, no power saving, etc.).
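This is not the asker's kernel code, but a rough userspace approximation of the measurement idea (assuming x86 and GCC/Clang for __rdtsc); the real version runs in a kthread with local IRQs disabled, which userspace cannot do, so this sketch will also pick up interrupt noise.

    /* Userspace approximation of the measurement loop described above. */
    #include <stdio.h>
    #include <stdint.h>
    #include <x86intrin.h>   /* __rdtsc() */

    int main(void)
    {
        uint64_t prev = __rdtsc();
        uint64_t max_delta = 0;

        for (long i = 0; i < 100000000L; i++) {
            uint64_t now = __rdtsc();
            uint64_t delta = now - prev;
            if (delta > max_delta)
                max_delta = delta;
            prev = now;
        }

        /* Converting ticks to nanoseconds requires the TSC frequency,
         * which must be determined separately per machine. */
        printf("max loop duration: %llu TSC ticks\n",
               (unsigned long long)max_delta);
        return 0;
    }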
My problem is that the maximum loop duration apparently depends on the GPU.
When the system is used normally (a bit of office work, Internet and programming / not really busy), the maximum loop duration is around 5 µs :-(
The same situation, but with a stressed CPU (the other three cores 100% busy), leads to a maximum loop duration of approximately 1 µs :-|
But when the GPU switches into idle mode (the screen turns off), the maximum loop duration drops to less than 300 ns :-)
Why is that? And how can I influence this behavior? I thought the CPU and the RAM were directly connected. I noticed that the maximum loop duration gets better in the first situation on a system with an external graphics card; for the second and third cases I couldn't see a difference. I also tested AMD and Intel systems without success - always the same :-(
I'm fine with the second case. But is it possible to achieve that without additionally stressing the CPU?
Many thanks in advance!
Billy
Look at those peaks in the first graph - what factor can cause this?
There's a lot of stuff going on in any general-purpose computer. When I performance-profiled apps in a former life, I saw this all the time and factored it out.
It's caused by a whole host of sources: the processor dealing with interrupts, disk maintenance routines, file system cleanup, completely useless background apps installed as automatically launched services without your knowledge, etc.
Your plot of idle time is a little disconcerting - it is awfully low. What apps do you have running that take up all that processing? Also, if your memory is low, say because you have 20 or 30 browser tabs/windows open, your CPU load will go through the roof due to all the paging and context switching.
When using the desktop PCs at my university (which have 4 GB of RAM), calculations in MATLAB are fairly speedy, but on my laptop (which also has 4 GB of RAM), the exact same calculations take ages. My laptop is much more modern, so I assume it has a similar clock speed to the desktops.
For example, I have written a program that calculates the solid angle subtended by 50 disks at 500 points. On the desktop PCs this calculation takes about 15 seconds; on my laptop it takes about 5 minutes.
Is there a way to reduce the time taken to perform these calculations? E.g., can I allocate more RAM to MATLAB, or can I boot my PC in a way that optimises it for MATLAB? I'm thinking that if the processor on my laptop is also doing calculations to run other programs, this will slow down the MATLAB calculations. I've closed all other applications, but I know there's probably a lot of stuff going on that I can't see. Can I boot my laptop in a way that has fewer of these things going on in the background?
I can't modify the code to make it more efficient.
Thanks!
You might run some of my benchmarks which, along with example results, can be found via:
http://www.roylongbottom.org.uk/
The CPU core used at any particular point in time is essentially the same on Pentiums, Celerons, Core 2s, Xeons and others; the only differences are L2/L3 cache sizes and external memory bus speeds. So you can compare most results with similar-vintage 2 GHz CPUs. Things to try, besides simple number-crunching tests:
1 - Try a memory test, such as my BusSpeed, to show that the caches are being used and that RAM is not dead slow (a rough sketch of such a test follows this list).
2 - Assuming Windows, check in Task Manager that the offending program is the one using most of the CPU time, and that with the program not running, CPU utilisation is around zero.
3 - Check that the CPU temperature is not too high, for example with SpeedFan (a free download).
4 - If the disk light is flashing, too much RAM might be being used, with some being swapped in and out; the Task Manager Performance tab would show this. Increasing RAM demands can be checked by some of my reliability tests.
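As promised above, here is a rough single-threaded sketch of the point-1 idea (not BusSpeed itself): time equal amounts of work over a small, cache-resident buffer and over a buffer big enough to spill to RAM. The buffer sizes are guesses and should be adjusted to the machine's caches.

    /* Compare cache-resident and RAM-resident read speed. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static double sweep(const int *buf, size_t n, int passes)
    {
        volatile long sum = 0;               /* stops the loop being optimized away */
        clock_t t0 = clock();
        for (int p = 0; p < passes; p++)
            for (size_t i = 0; i < n; i++)
                sum += buf[i];
        return (double)(clock() - t0) / CLOCKS_PER_SEC;
    }

    int main(void)
    {
        size_t small_n = 32 * 1024 / sizeof(int);          /* ~32 KB: fits in L1 */
        size_t big_n   = 256 * 1024 * 1024 / sizeof(int);  /* ~256 MB: spills to RAM */
        int *small = calloc(small_n, sizeof(int));
        int *big   = calloc(big_n, sizeof(int));
        if (!small || !big) return 1;

        /* Equal total work: many passes over the small buffer, one over the big. */
        printf("cache: %.2f s\n", sweep(small, small_n, (int)(big_n / small_n)));
        printf("RAM:   %.2f s\n", sweep(big, big_n, 1));
        return 0;
    }

If the RAM figure is many times slower than on a comparable machine, memory rather than the CPU is the likely culprit.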
There are many things that go into computing power besides RAM. You mention processor speed, but there is also the number of cores, GPU capability, and more. Programs like MATLAB are designed to take advantage of features like parallelism.
Summary: You can't compare only RAM between two machines and expect to know how they will perform with respect to one another.
Side note: 4 GB is not very much RAM for a modern laptop.
Firstly you should perform a CPU performance benchmark on both computers.
Modern operating systems usually apply their most aggressive power-management schemes when running on a laptop. This usually means turning off one or more cores, or setting them to a very low frequency. For example, a quad-core CPU that normally runs at 2.0 GHz could, on battery, be throttled down to 700 MHz on one core while the other three are basically put to sleep. (The numbers are illustrative, not taken from a real example.)
The OS manages the CPU frequency dynamically, tweaking it on the order of seconds. You will need a monitoring tool that actually asks for the CPU frequency every second (without doing busy work itself) to know whether this is the case.
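A minimal sketch of such a check, assuming Linux with the standard cpufreq sysfs interface (the file may be absent on some systems; on Windows or macOS a monitoring utility serves the same purpose):

    /* Poll the kernel's reported frequency for cpu0 once per second. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        for (int i = 0; i < 30; i++) {       /* sample for 30 seconds */
            FILE *f = fopen("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq", "r");
            if (!f) { perror("fopen"); return 1; }
            long khz = 0;
            if (fscanf(f, "%ld", &khz) == 1)
                printf("cpu0: %ld MHz\n", khz / 1000);
            fclose(f);
            sleep(1);                        /* idle wait - no busy work */
        }
        return 0;
    }

If the reported frequency jumps up as soon as the MATLAB job starts on mains power but stays low on battery, power management is the explanation.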
Plugging in the laptop will make the OS use a less aggressive power management scheme.
(If this turns out to be unrelated to MATLAB, please "flag" this post and ask a moderator to move the question to the Super User site.)
The background is this: next week our office will have one day with no heating, due to maintenance. The outdoor temperature is expected to be between 7 and 12 degrees Celsius, so it might get chilly. The portable electric heaters are too few to cater for everyone.
However, in my office of about 6-8 m² I have a big honkin' (3-year-old) workstation (HP xw8600 with a 3.0 GHz quad-core Xeon) that should be able to output a couple of hundred watts of heat. Running FurMark will max out the GPU, but I'm not sure how best to work the CPU.
Last time I was in a cold office I either compiled more often or just launched 4-8 DOSBoxes running Norton Commander, but I think one can do better by using SSE1/2/3/4, MMX, etc. - i.e. stuff that does more work per cycle.
So, which CPU instructions toggle the most transistors each cycle, and thus cause the CPU to draw the most power and give off the most heat?
If I had a power meter available I could benchmark it myself, but I figure this would be a fun challenge for the SO crowd. :)
For your specific goal, if you really want to use your system as a heat generator, you first need to make sure the cooling system is working really well (throwing the heat out of the box). Today's processors are designed to throttle themselves when they reach a critical temperature, which, with a proper heatsink, happens when the processor is at TDP (Thermal Design Power, the maximum power for the processor running normal programs). With a better heatsink and good ventilation (a box fan?), you can probably get beyond TDP, assuming your power supply can handle it. If you turn the fan off, you will basically hit the thermal limit right away.

To be more explicit, the individual instructions that burn the most are generally load instructions that miss in the caches and go out to memory. To guarantee misses, you'll want to allocate a chunk of memory bigger than the last-level CPU cache and hop around in it. The hopping pattern in the maximum-power case is a bit complex, because you're trying to keep the maximum number of misses outstanding at every level of the cache hierarchy simultaneously. If you have 3 levels of cache, in a given period of time you can have more misses to the L1 than to the L2, than to the L3, than to the DRAM page (and the specific design of your processor puts a total limit on outstanding misses). Between misses, the instruction doesn't matter too much, but I'd guess that one of the SSE4 multiplies (PMULUDQ) is probably the best, since on a lot of modern processors it executes pretty quickly and does a whole lot of work (compared to, say, an add).
The funny thing is, running the GPU may limit the amount of heat you can generate with misses to the L3 cache, since memory may be bogged down by the GPU. In that case, you should make sure that all accesses to the L3 are hits, but that you're missing in the other levels.
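A rough sketch of the simplest version of that "hop around a buffer bigger than the last-level cache" idea (a single dependent chain; the maximum-power pattern described above would keep several misses in flight at once). The buffer size and iteration count are illustrative.

    /* Chase a randomly ordered cycle through a buffer larger than a
     * typical L3, so nearly every step misses and goes out to DRAM. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t n = (256u * 1024 * 1024) / sizeof(size_t);  /* ~256 MB */
        size_t *next = malloc(n * sizeof(size_t));
        if (!next) return 1;

        for (size_t i = 0; i < n; i++)
            next[i] = i;
        /* Sattolo's shuffle: produces one big cycle through all elements,
         * which defeats the hardware prefetchers. */
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = ((size_t)rand() * ((size_t)RAND_MAX + 1) + (size_t)rand()) % i;
            size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
        }

        /* Dependent loads: each step must wait for the previous miss. */
        size_t idx = 0;
        for (long k = 0; k < 1000000000L; k++)
            idx = next[idx];
        printf("done (%zu)\n", idx);   /* keep the chase from being optimized away */

        free(next);
        return 0;
    }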
For GeForce graphics, my CudaMFLOPS program (free) is quite handy for getting the graphics card up to high temperatures. If you have a suitable card, details are at:
http://www.roylongbottom.org.uk/cuda1.htm#anchor8
I find that my tests that execute SSE instructions with data from L1 cache generally produce the highest CPU temperatures.
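As a rough illustration of that idea (not Roy's actual benchmark), the following hammers the SSE multiply units on a tiny buffer that stays in the L1 cache. It assumes an x86 compiler with SSE2 (e.g. gcc -O2 -msse2 heat.c), and one copy should be run per core to heat the whole package.

    /* Tight SSE multiply loop over L1-resident data. */
    #include <stdio.h>
    #include <emmintrin.h>   /* SSE/SSE2 intrinsics */

    int main(void)
    {
        __m128 data[64];                     /* 1 KB of floats: fits in L1 */
        for (int i = 0; i < 64; i++)
            data[i] = _mm_set1_ps(1.0f);

        __m128 acc = _mm_set1_ps(1.0f);
        for (long iter = 0; iter < 500000000L; iter++)
            acc = _mm_mul_ps(acc, data[iter & 63]);  /* L1 loads + FP multiplies */

        float out[4];
        _mm_storeu_ps(out, acc);
        printf("%f\n", out[0]);              /* keep the loop from being optimized away */
        return 0;
    }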
For the CPU, use Prime95. It is lightweight and will load up all the cores nicely. You aren't really going to get much heat out of a 3 GHz Xeon, though. Chips of that age are usually good for over 4 GHz with average cooling, and close to 5 GHz with high-end water loops. With a 6-core chip at over 4 GHz and extra voltage added you might be hitting 200 W, but with that system you will be lucky to get the CPU to 100 W.
As for the GPU, the Heaven benchmark is a good one for quickly getting it up to temperature. Again, unless you have a high-end card, a couple of hundred watts of heat is unlikely. Another alternative on AMD GPUs (maybe NVIDIA too?) is to use crypto-currency mining software - maybe get a USB stick with a mining Linux distribution installed and ready to go. You could also run Prime95 on the same rig, as mining software uses very little CPU time.
I actually kept a couple of rooms warm over winter with the heat from a computer, only rarely needing extra heating. This was done with a crypto-currency mining rig, which had 4 GPUs running at ~80 °C, 24/7, with a box fan to circulate the air around the room. That rig had a 1300 W PSU. Might I suggest that, instead of trying to use the computer to keep you warm, you wear more clothes?
I have built software that I deploy on a Windows 2003 server. The software runs continuously as a service, and it's the only application on the Windows box of importance to me. Part of the time it's retrieving data from the Internet, and part of the time it's doing computations on that data. It's multi-threaded - I use thread pools of roughly 4-20 threads.
I won't bore you with all the details, but suffice it to say that as I enable more threads in the pool, more concurrent work occurs and CPU use rises (as does demand for other resources, like bandwidth, although that's of no concern to me - I have plenty).
My question is this: should I simply try to max out the CPU to get the best bang for my buck? Intuitively, I don't think it makes sense to run at 100% CPU; even 95% seems high, almost like I'm not giving the OS much room to do what it needs to do. I don't know the right way to identify the best balance. I'm guessing I could measure and measure and probably find that the best throughput is achieved at an average CPU utilization of 90% or 91%, etc., but...
I'm just wondering if there's a good rule of thumb about this? I don't want to assume that my testing will take into account all kinds of workload variations. I'd rather play it a bit safe, but not too safe (or else I'm underusing my hardware).
What do you recommend? What is a smart, performance-minded utilization rule for a multi-threaded, mixed-load (some I/O, some CPU) application on Windows?
Yep, I'd suggest 100% is thrashing, so I wouldn't want to see processes running like that all the time. I've always aimed for 80% to get a balance between utilization and room for spikes / ad-hoc processes.
An approach I've used in the past is to crank up the pool size slowly and measure the impact (both on CPU and on other constraints such as I/O); you never know, you might find that I/O suddenly becomes the bottleneck.
CPU utilization shouldn't matter in this I/O-intensive workload; you care about throughput. So try a hill-climbing approach: programmatically inject and remove worker threads and track completion progress...
If you add a thread and it helps, add another one. If you try a thread and it hurts, remove it.
Eventually this will stabilize.
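As a toy illustration of that add-a-thread-and-measure loop (not the .NET implementation), here is a small C sketch; measure_throughput() is a hypothetical stand-in that just models a workload peaking at 8 workers, where a real service would count completed tasks per measurement interval.

    /* Hill-climbing control loop over the worker count. */
    #include <stdio.h>

    static double measure_throughput(int workers)
    {
        /* Rises with workers, then contention makes extra workers hurt. */
        return workers * 100.0 - (workers > 8 ? (workers - 8) * 150.0 : 0.0);
    }

    int main(void)
    {
        int workers = 2;
        double best = measure_throughput(workers);

        for (int step = 0; step < 15; step++) {
            int trial = workers + 1;            /* try adding one thread */
            double t = measure_throughput(trial);
            if (t > best) {                     /* it helped: keep it */
                workers = trial;
                best = t;
            }                                   /* otherwise drop it again */
            printf("step %2d: %d workers, throughput %.0f\n", step, workers, best);
        }
        return 0;
    }

In this toy model the loop settles at 8 workers, which is the "eventually this will stabilize" behaviour described above.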
If this is a .NET-based app, hill climbing was added to the .NET 4 thread pool.
UPDATE:
Hill climbing is a control-theory-based approach to maximizing throughput. You can call it trial and error if you want, but it is a sound approach. In general, there isn't a good 'rule of thumb' to follow here, because the overheads and latencies vary so much that it's not really possible to generalize. The focus should be on throughput and task/thread completion, not CPU utilization. For example, it's pretty easy to peg the cores with coarse- or fine-grained synchronization without actually making any difference in throughput.
Also, regarding .NET 4: if you can reframe your problem as a Parallel.For or Parallel.ForEach, the thread pool will adjust the number of threads to maximize throughput, so you don't have to worry about this.
-Rick
Assuming nothing of importance other than the OS runs on the machine:
If your load is also constant, you should aim at 100% CPU utilization; everything else is a waste of CPU. Remember that the OS handles the threads, so it is still able to run; it's hard to starve the OS with a well-behaved program.
But if your load is variable and you expect peaks you should account for, I'd say 80% CPU is a good threshold to use, unless you know exactly how that load will vary and how much CPU it will demand, in which case you can aim for the exact number.
If you simply give your threads a low priority, the OS will do the rest and take cycles as it needs to do its work. Server 2003 (and most server OSes) are very good at this; there is no need to try to manage it yourself.
I have also used 80% as a general rule-of-thumb for target CPU utilization. As some others have mentioned, this leaves some headroom for sporadic spikes in activity and will help avoid thrashing on the CPU.
Here is a little (older but still relevant) advice from the Weblogic crew on this issue: http://docs.oracle.com/cd/E13222_01/wls/docs92/perform/basics.html#wp1132942
If you feel your load is very even and predictable, you could push that target a little higher, but unless your user base is exceptionally tolerant of periodic slow responses and your project budget is incredibly tight, I'd recommend adding more resources to your system (adding a CPU, using a CPU with more cores, etc.) over making a risky move to try to squeeze another 10% of CPU utilization out of your existing platform.