This happened to be one of my class test questions.
In a demand paging system, the CPU utilization is 20% and the paging disk utilization is 97.7%.
If the CPU speed is increased, will the CPU usage be increased in this scenario?
Paging is effectively a bottleneck in this example. The amount of computation per unit time might increase slightly with a faster CPU but not in proportion to the increase in CPU speed (so the percentage utilization would decrease).
A quick and dirty estimation would use Amdahl's Law. In the example, 80% of the work is paging and 20% is CPU-limited, so an N-fold improvement in CPU performance would result in a speedup factor of 1/((1 - 0.2) + (0.2/N)).
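As a quick sketch of that formula (a few lines of Python; the 0.2 CPU-limited fraction comes from the example, the values of N in the loop are arbitrary):

```python
# Quick sketch of the Amdahl's Law estimate above.
def amdahl_speedup(cpu_fraction, n):
    """Overall speedup when only the CPU-limited fraction runs n times faster."""
    return 1 / ((1 - cpu_fraction) + cpu_fraction / n)

for n in (2, 10, 100):
    print(n, round(amdahl_speedup(0.2, n), 3))
# 2   -> 1.111
# 10  -> 1.22
# 100 -> 1.247   (approaches 1/0.8 = 1.25 as n grows)
```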
A more realistic estimate would add an awareness of queueing theory to recognize that if the paging requests came in more frequently, the disk utilization would actually increase even with a fixed buffer size. However, the increase in paging utilization would be smaller than the increase in request frequency.
Without looking at the details of queueing theory, one can also simply see that the maximum potential improvement in paging is just over 2% (if paging utilization were driven up to 100%, that is a factor of 100/97.7, or about 1.0235). Even at 100% paging utilization, paging would take 0.80/(100/97.7) of the original time, so clearly there is not much opportunity for improvement.
If a 10-fold CPU speed improvement drove paging utilization to effectively 100%, every second of work under the original system would use 781.6 milliseconds in paging (800 ms / (100/97.7)) and 20 milliseconds in the CPU (200 ms / 10). CPU utilization would decrease to 20 / (781.6 + 20) or about 2.5%.
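The same arithmetic as a couple of lines of Python (all numbers are the ones from the paragraph above):

```python
# Worked numbers from the paragraph above: per 1 second of original work,
# 800 ms was paging and 200 ms was CPU time.
paging_ms = 800 / (100 / 97.7)   # paging driven to 100% utilization -> ~781.6 ms
cpu_ms = 200 / 10                # 10-fold CPU speedup -> 20 ms
cpu_utilization = cpu_ms / (paging_ms + cpu_ms)
print(round(paging_ms, 1), cpu_ms, round(cpu_utilization * 100, 1))
# 781.6 20.0 2.5   -> CPU utilization drops to about 2.5%
```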
Related
The CPU frequency and CPU usage are the main factors that impact energy consumption (as far as I know). However, which is better from an energy-saving perspective for running a task with minimum energy consumption:
Option 1: Maximum CPU frequency with minimum usage
Option 2: Maximum CPU usage with minimum frequency.
Work per time scales approximately linearly with CPU frequency. (A bit less than linear because higher CPU frequency means DRAM latency is more clock cycles).
CPU power has two components: switching (dynamic) power, which scales roughly with f^3 (because voltage has to increase for higher frequency, and the switching transistors pump that V^2 capacitor energy more often); and leakage power, which doesn't vary as dramatically. At high frequency dynamic power dominates, but as you lower the frequency, leakage eventually becomes significant. The smaller your transistors, the more significant leakage is.
System-wide, there's also other power for things like DRAM that doesn't change much or at all with CPU frequency.
Min frequency is more efficient, unless the minimum is far below the best frequency for work per energy. (Some parts of power decrease with frequency, others like leakage current and DRAM refresh don't).
Frequencies lower than max have lower energy cost per unit of work (better task efficiency), down to a certain point: something like 800 MHz on a Skylake CPU on Intel's 14 nm process. If there's work to be done, there's no gain from dropping below that; just race-to-sleep at that most efficient frequency. (Power would decrease, but the work rate would decrease more below that point.)
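To see why there is a most-efficient frequency at all, here is a toy model in Python: dynamic power grows roughly as f^3, a fixed chunk of power (leakage, DRAM, etc.) does not scale down, and work per second grows roughly linearly with f. The constants are invented for illustration, not measurements of any real CPU.

```python
# Toy model of energy per unit of work vs. frequency.
# All constants are made up for illustration, not measured values for a real CPU.
def energy_per_work(f_ghz, k_dyn=1.0, p_static=0.4):
    dynamic = k_dyn * f_ghz ** 3   # switching power: ~f^3 once voltage scales with f
    static = p_static              # leakage + other power that doesn't scale down
    work_rate = f_ghz              # work per second: roughly linear in frequency
    return (dynamic + static) / work_rate   # energy per unit of work (arbitrary units)

for f in (0.4, 0.6, 0.8, 1.0, 2.0, 4.0):
    print(f, round(energy_per_work(f), 2))
# The minimum lands around 0.6 GHz with these made-up constants: below it, the
# fixed power dominates and the task just runs longer; above it, f^3 dynamic
# power dominates.
```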
https://en.wikichip.org/wiki/File:Intel_Architecture,_Code_Name_Skylake_Deep_Dive-_A_New_Architecture_to_Manage_Power_Performance_and_Energy_Efficiency.pdf is the slide deck from Efraim Rotem's IDF2015 talk about Skylake power management; it covers a lot of that general-case stuff well. Unfortunately I don't know where to find a copy of the audio from the talk; it was up for a year or so afterwards, but the original link is dead now. :/
Also, in general, on dynamic power (from switching, not leakage) scaling with frequency cubed when you adjust voltage as well as frequency, see Modern Microprocessors: A 90-Minute Guide! and
https://electronics.stackexchange.com/questions/614018/why-does-switching-cause-power-dissipation
https://electronics.stackexchange.com/questions/258724/why-do-cpus-need-so-much-current
https://electronics.stackexchange.com/questions/548601/why-does-decreasing-the-cmos-supply-voltage-also-decrease-the-maximum-circuit-fr
My inference from the above is that the number of requests has increased, which has increased the CPU usage, and so the response time has also increased.
Is my inference correct?
How can I make use of the CPU credits?
Or is increasing the RAM size the only solution?
I am using cloud.elastic.co
This is a chart of the CPU utilization and the CPUCredit in my t3.small instance:
According to the documentation, an instance will only use CPU credits when the CPU utilization is above the baseline. If the instance has more than one vCPU, the baseline performance will be shown at the original level.
If I understand correctly, the instance should use CPU credits only when utilization is above 20%. In the chart, it seems like CPU credits are consumed even when the utilization is lower. Why is that?
The graph in your question shows the average CPU utilization per time period. For the calculation of CPU credit usage, however, what matters is the maximum CPU utilization per minute. Therefore, if you change the aggregation method of your CPU utilization from average to maximum, you should see a graph that makes more sense.
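As a hypothetical illustration of why the aggregation method matters (the 20% baseline is from the question; the per-minute utilization values are invented):

```python
# Hypothetical per-minute CPU utilization samples inside one reporting period.
# The 20% baseline comes from the question; the sample values are made up.
baseline = 20.0
per_minute_max = [5, 8, 60, 7, 10]   # one spiky minute well above the baseline

average = sum(per_minute_max) / len(per_minute_max)
print(average)                                       # 18.0 -> looks below the baseline
print([m for m in per_minute_max if m > baseline])   # [60] -> that minute spends credits
# A graph aggregated by "average" hides the spike that actually consumed credits;
# aggregating by "maximum" makes it visible.
```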
I have a book statement:
The shorter the memory latency, the smaller the cache block.
I don't understand it. In my current understanding, memory latency is the time required for a data movement. So it seems like a smaller cache block means less data to be sent, so it's quicker?
The answer given is: "A lower miss penalty can enable smaller blocks, since you don't have that much latency to amortize." That statement is currently useless to me.
That is a simple consequence of the limited speed of light. Signals need time to travel. For a copper wire it is roughly 20 cm/ns. If you have a memory chip 10 cm away from your CPU, you can send a signal and get an ACK back at a rate of 1 GHz (0.5 ns for the data to go from the CPU to memory and 0.5 ns from memory back to the CPU for the ACK).
If you put the memory modules nearer to the CPU, let's say only 5 cm away, you can reduce the cache by some margin because you are already two times faster and the benefit of the cache will be less.
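To make the "latency to amortize" wording from the book's answer concrete, here is a toy calculation; the bandwidth, block sizes, and latencies in it are made-up numbers chosen only to show the shape of the trade-off.

```python
# Toy calculation of the "latency to amortize" idea.  The 8 bytes/ns bandwidth,
# the block sizes, and the 60 ns / 5 ns latencies are all invented numbers.
def penalty_per_byte(latency_ns, block_bytes, bw_bytes_per_ns=8):
    transfer_ns = block_bytes / bw_bytes_per_ns
    return (latency_ns + transfer_ns) / block_bytes   # miss penalty per useful byte

for latency_ns in (60, 5):            # far/slow memory vs. a much lower-latency one
    for block_bytes in (32, 64, 128):
        print(latency_ns, block_bytes, round(penalty_per_byte(latency_ns, block_bytes), 2))
# With 60 ns of latency, growing the block from 32 B to 128 B cuts the per-byte
# penalty about 3.4x (2.0 -> 0.59 ns/B): the fixed latency is amortized over more
# bytes.  With only 5 ns of latency the gain is only about 1.7x (0.28 -> 0.16),
# so there is much less to amortize and a smaller block costs little.
```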
What is the disparity between bus throughput and CPU throughput? How does this adversely impact sequential computing? How does this adversely impact parallel computing?
If your CPU can access its cache in 1 ns steps, but your memory takes 60 ns to deliver a random memory word, at some point your processor is going to read memory at a rate 60x slower than the cache. If you are processing a lot of data, you may see a tremendous slowdown, even for sequential programs.
If you have multiple CPUs, they will collectively place a higher bandwidth demand on the bus. Imagine a serial-access bus with 64 CPUs all trying to read from it: only one succeeds at any given moment. The consequence is that it is hard to get a parallelism of 64 in such a system, unless each processor stays entirely within its cache.
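A back-of-the-envelope sketch of both effects, reusing the 1 ns / 60 ns / 64-CPU numbers from above; the 10% miss rate is an invented example value.

```python
# Back-of-the-envelope version of both effects.  The 1 ns cache, 60 ns memory and
# 64 CPUs come from the answer above; the 10% miss rate is an invented example.
cache_ns, mem_ns, cpus, miss_rate = 1, 60, 64, 0.10

# Sequential effect: average access time once 10% of accesses go out to memory.
avg_ns = (1 - miss_rate) * cache_ns + miss_rate * mem_ns
print(round(avg_ns, 2))                    # ~6.9 ns per access vs. 1 ns for pure cache hits

# Parallel effect: each CPU alone would occupy a serial bus for
# miss_rate * mem_ns = 6 ns out of every ~6.9 ns of its work.
bus_share_per_cpu = miss_rate * mem_ns / avg_ns
print(round(bus_share_per_cpu, 2))         # ~0.87: a single CPU nearly saturates the bus
print(round(min(cpus, 1 / bus_share_per_cpu), 2))
# ~1.15: the useful parallelism the bus can sustain, no matter how many of the
# 64 CPUs are waiting -- unless almost every access stays in each CPU's own cache.
```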