How to interpret CPU profiling graph - Go

I was following the Go blog here.
I tried to profile my program, but the output looks a bit different. (Has Go moved from sampling to instrumentation?)
I wonder what these numbers mean.
Especially: showing nodes accounting for 2.59s, 92.5% of 2.8s.
What does total sample = 2.8s mean? That the samples were collected over an interval of 2.8 seconds?
Does it mean that only nodes that run for over 92.5% of the sample time are shown?
Also, I wonder how these numbers are generated. In the original Go blog, the measure is how many times a function is detected in execution among all samples. However, we are dealing with seconds here. How does the Go profiling tool know how many seconds a function call takes?
Any help will be appreciated

Think of the graph as a graph of a resource: time. You start at the top with, for example, 10 seconds. Then you see that 5 of those seconds went to time.Sleep and 5 went to encoding/json. How that time is divided is represented by the arrows, which show how much time flowed into each part of the program. So now we have 3 nodes: the root with 10 seconds, time.Sleep with 5 seconds, and encoding/json with 5 seconds. Those 5 seconds in encoding/json are then broken down further into the functions that took up most of the time.

The notation 0.01s (percentage) of 0.02s (larger percentage) means that this function itself took 0.01s of processing time out of a total of 0.02s spent in this function and everything it calls, for this particular call stack; the number on an arrow is the time flowing along that edge. The percentages are relative to the whole pie, i.e. the program's total execution time. So you'll see, for example, that the encoding/json string encoder took 0.36 percent of the total execution time/resources of your program.
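For context, Go's CPU profiler is still sampling-based (it records call stacks roughly 100 times per second). Here is a minimal sketch for collecting such a profile yourself with the standard runtime/pprof package; the file name and the work inside doWork are placeholders:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	// While CPU profiling is enabled, the runtime samples the running
	// goroutines' call stacks roughly 100 times per second and writes
	// them to this file; the graph's seconds come from those samples.
	f, err := os.Create("cpu.prof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	doWork() // stand-in for the code you actually want to profile
}

// doWork burns some CPU so the profile has something to show.
func doWork() {
	total := 0.0
	for i := 0; i < 200000000; i++ {
		total += float64(i%7) * 1.000001
	}
	fmt.Println(total)
}
```

You can then inspect cpu.prof with go tool pprof cpu.prof and the top or web commands; the seconds and percentages in the resulting graph are derived from those samples, exactly as described above.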

Related

Fluctuations in execution time, is this normal?

I'm trying to implement some sort of template matching, which requires calling a function more than 10 thousand times per frame. I've managed to reduce the execution time of my function to a few microseconds. However, about 1 in 5 executions will take quite a bit longer to run. While the function usually runs in less than 20 microseconds, these cases can take even 100 microseconds.
Trying to find the part of the function that has fluctuating execution time, I realized that big fluctuations appear in many parts, almost randomly. And this "ghost" time is added even in parts that take constant time. For example, iterating through a specific number of vectors and taking their dot product with a specific vector fluctuates from 3 microseconds to 20+.
All the tests I did seem to indicate that the fluctuation has nothing to do with the varying data but instead it's just random at some parts of the code. Of course I could be wrong and maybe all these parts that have fluctuations contain something that causes them. But my main question is specific and that's why I don't provide a snippet or runtime data:
Are fluctuations of execution time from 3 microseconds to 20+ microseconds for constant time functions with the same amount of data normal? Could the cpu occasionally be doing something else that is causing these ghost times?
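One way to quantify this kind of fluctuation (a rough sketch in Go; the workload, run count, and 20 microsecond threshold are placeholders for whatever your real function and latency budget are) is to time every call and count the outliers:

```go
package main

import (
	"fmt"
	"time"
)

// hotFunction stands in for the routine whose latency fluctuates.
func hotFunction() {
	sum := 0.0
	for i := 0; i < 10000; i++ {
		sum += float64(i) * 0.5
	}
	_ = sum
}

func main() {
	const runs = 100000
	const threshold = 20 * time.Microsecond

	var worst time.Duration
	slow := 0
	for i := 0; i < runs; i++ {
		start := time.Now()
		hotFunction()
		elapsed := time.Since(start)
		if elapsed > threshold {
			slow++
		}
		if elapsed > worst {
			worst = elapsed
		}
	}
	fmt.Printf("%d of %d runs exceeded %v (worst: %v)\n", slow, runs, threshold, worst)
}
```

Collecting the whole distribution rather than a single timing makes it easier to see whether the slow cases correlate with the input data or show up essentially at random, which is what you would expect from things like preemption by the OS scheduler, interrupts, cache misses, or CPU frequency scaling.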

Calculating CPU Utilization for Round Robin Algorithm

I have been stuck on this for the past day. I'm not sure how to calculate the CPU utilization percentage for processes using the round robin algorithm.
Let's say we have this data with a time quantum of 1: job letter followed by arrival time and burst time. How would I go about calculating the CPU utilization? I believe the formula is
total burst time / (total burst time + idle time). I know idle time means the time when the CPU is not busy, but I'm not sure how to actually calculate it for these processes. If anyone can walk me through it, it would be greatly appreciated.
A 2 6
B 3 1
C 5 9
D 6 7
E 7 10
Well, the formula is correct, but in order to know the total time you need to know the idle time of the CPU, and when does your CPU become idle? During a context switch it becomes idle, and it depends on the short-term scheduler how much time it takes to assign the next process to the CPU.
With a time quantum of 10-100 milliseconds, the context-switch time is around 10 microseconds, which is a very small factor; from that you can guess the context-switch overhead with a time quantum of 1 millisecond. It will be negligible, but such a small quantum also results in too many context switches.
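To make the formula concrete with the numbers above (assuming the CPU starts at time 0, time is measured in the same units as the table, and context-switch overhead is ignored): the total burst time is 6 + 1 + 9 + 7 + 10 = 33. The first job, A, arrives at time 2, so the CPU is idle from 0 to 2; after that the ready queue never empties until all jobs finish at time 2 + 33 = 35, so the idle time is 2. The utilization is then 33 / (33 + 2) ≈ 94.3%.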

Golang - What is the meaning of the seconds in CPU profiling graph?

For example, the data in the figure for the runtime.scanobject node:
13.42s
runtime.scanobject 9.69s(4.51%) of 18.30s(8.52%).
5.33s
What is the meaning of the seconds and percentages?
Thanks.
When CPU profiling is enabled, the Go program stops about 100 times per second and records a sample consisting of the program counters on the currently executing goroutine's stack.
The times and percentages are in reference to those samples: 9.69s (4.51%) is the time attributed to runtime.scanobject itself, 18.30s (8.52%) is the time spent in runtime.scanobject plus everything it calls, and the edge labels (13.42s, 5.33s) are the portions of that time flowing from a particular caller or to a particular callee. The percentages are taken over the total sampled time.
Here is a nice reference for you to read more about it: https://blog.golang.org/profiling-go-programs
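If the program is a long-running service, a common alternative (a sketch; the port and the 30-second duration are arbitrary) is to expose the profiler over HTTP with net/http/pprof and pull a profile from it:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// In a real service this listener would run alongside your normal
	// handlers; here it only serves the profiling endpoints.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```

Then go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30 collects 30 seconds of samples, and the web or top commands show the same flat/cumulative seconds discussed above.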

inherent parallelism for a program

Hi, I have a question regarding inherent parallelism.
Let's say we have a sequential program which takes 20 seconds to complete execution. Suppose the execution time consists of 2 seconds of setup time at the beginning and 2 seconds of finalization time at the end of the execution, and the remaining work can be parallelized. How do we calculate the inherent parallelism of this program?
How do you define "inherent parallelism"? I've not heard the term. We can talk about "possible speedup".
OP said "remaining work can be parallelized"... to what degree?
Can it run with infinite parallelism? If this were possible (it isn't practical), then the total runtime would be 4 seconds with a speedup of 20/4 --> 5.
If the remaining work can be run on N processors perfectly in parallel,
then the total runtime would be 4 + 16/N. The ratio of that to 20 seconds is 20/(4 + 16/N), which can give pretty much any speedup from 1 (no speedup, N = 1) to 5 (the limiting case as N grows), depending on the value of N.
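This is just Amdahl's law with a serial fraction of 4/20 = 0.2, so the best possible speedup is 1/0.2 = 5. A quick sketch that tabulates the formula above for a few arbitrary values of N:

```go
package main

import "fmt"

func main() {
	const serial = 4.0    // setup + finalization, seconds
	const parallel = 16.0 // perfectly parallelizable work, seconds

	for _, n := range []float64{1, 2, 4, 8, 64, 1024} {
		runtime := serial + parallel/n
		fmt.Printf("N=%6.0f  runtime=%6.2fs  speedup=%.2f\n",
			n, runtime, (serial+parallel)/runtime)
	}
}
```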

Understanding the Instruments Time Profiler column headings

I'm running an application through the profiler with a sampling rate of 1 ms, but I'm having trouble understanding what the column headers mean. The documentation seems to be lacking the definitions for most of the columns headings, though I managed to decipher Self, # Self and Self % from the answer here. This is what I have so far:
Total Samples: The total number of (1 ms) samples where the program was in a given function
Total Time: The total time spent in a function (corresponds to total samples using a 1ms sampling rate)
Self: Explained in the linked question, but how does it differ from total time? I should be able to figure out the meaning of # Self and Self % from this.
Total %: Total samples as a percentage of the total running time
The rest of the column headings seem to be combinations of the above (perhaps due to the 1ms sampling rate) or are self-explanatory. For example, I have a function that takes 647621ms of total time (89.4%), but has a Self/# Self of 9. Does that mean the function is called often, but takes little time to execute? On the other hand, another function takes 15559ms of total time (2.1%) but Self/# Self is 13099, which would mean that it is called less often, but takes much longer to complete. Am I on the right track?
Recent versions of Instruments don't have a Total Samples column, but I'll explain the difference between total samples, total time, and self because it explains how the Time Profiler instrument works. The Time Profiler instrument records the call stack periodically, every millisecond by default. The Total Samples column tells you the number of samples a method was in the call stack. The Total Time column tells you the amount of time the method was in the call stack. The Self column tells you the number of samples a method was at the top of the call stack, which means your app was in the method when Instruments recorded the sample.
The Self column is much more important than the Total Samples and Total Time columns. Your main() function is going to have a high total samples count and high total time because main() is in the call stack the entire time your application is running. But spending time optimizing the main() function in a Cocoa/Cocoa Touch application is a waste of time because all main() does is launch the application. Focus on methods that have a high Self value.
Recent versions of Instruments have a Running Time column. Each listing in the column has two values: a time and a percentage. The time corresponds to the Total Time in your question. The percentage corresponds to the Total % in your question.
UPDATE
Let me answer the question on your function examples in the last paragraph. Your application isn't spending much time inside the function in your first example (I'll call it Function A) because its Self entry is only 9. That means there were only 9 samples where your application was inside Function A. The high total time means your application spent a lot of time inside functions that Function A calls. Function A is in the call stack often, but not at the top of the call stack often.
In your second example the application is spending more time in the function, Function B, because its Self entry is 13099. The application does not spend a lot of time in functions that Function B calls because the total time is much smaller. Function B is at the top of the call stack more often than A and in the call stack less often than A. If you had a performance issue in your application, Function B would be the function to examine for ways to improve its performance. Optimizing Function A would not help much because its Self entry is only 9.
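The same self-versus-total distinction appears in any sampling profiler. As an illustrative sketch (in Go rather than the native code Instruments usually profiles, and with invented function names), outer below would show a large total time but a tiny self time, because it is almost never at the top of the call stack itself:

```go
package main

import "fmt"

// outer does almost no work of its own, so a sampling profiler rarely
// catches it at the top of the stack: low self time, high total time.
func outer() float64 {
	return inner()
}

// inner does the actual work, so it is usually at the top of the stack:
// high self time.
func inner() float64 {
	sum := 0.0
	for i := 0; i < 100000000; i++ {
		sum += float64(i % 10)
	}
	return sum
}

func main() {
	fmt.Println(outer())
}
```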
