I've been trying to use perf to profile a running process, but I can't make sense of some of the numbers perf outputs. Here is the command I used and the output I got:
$ sudo perf stat -x, -v -e branch-misses,cpu-cycles,cache-misses sleep 1
Using CPUID GenuineIntel-6-55-4
branch-misses: 7751 444665 444665
cpu-cycles: 1212296 444665 444665
cache-misses: 4902 444665 444665
7751,,branch-misses,444665,100.00,,
1212296,,cpu-cycles,444665,100.00,,
4902,,cache-misses,444665,100.00,,
May I know what event the number "444665" represents?
The -x output format of perf stat is described in the perf-stat man page, in the CSV FORMAT section. Here is a fragment of that man page, with the optional columns omitted:
CSV FORMAT
With -x, perf stat is able to output a not-quite-CSV format output.
Commas in the output are not put into "". To make it easy to parse, it
is recommended to use a different character like -x \;
The fields are in this order:
· counter value
· unit of the counter value or empty
· event name
· run time of counter
· percentage of measurement time the counter was running
Additional metrics may be printed with all earlier fields being
empty.
So you have the counter value, an empty unit field, the event name, the run time, and the percentage of time the counter was active (relative to the program's total running time).
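Given that field order, here is a quick way to pull out the interesting columns (just a sketch; note that perf stat writes its results to stderr, hence the 2>&1):
$ perf stat -x, -e branch-misses,cpu-cycles sleep 1 2>&1 | \
  awk -F, '{ printf "%-14s value=%s run_time_ns=%s active=%s%%\n", $3, $1, $4, $5 }'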
By comparing the output of these two commands (as recommended by Peter Cordes in a comment):
perf stat awk 'BEGIN{for(i=0;i<10000000;i++){}}'
perf stat -x \; awk 'BEGIN{for(i=0;i<10000000;i++){}}'
I think that the run time is in nanoseconds and covers all the time this counter was active. When you run perf stat with a non-conflicting set of events, and there are enough hardware counters to count all the requested events, the run time will be almost the total time the profiled program was running on a CPU. (Example of a too-large event set: perf stat -x , -e cycles,instructions,branches,branch-misses,cache-misses,cache-references,mem-loads,mem-stores awk 'BEGIN{for(i=0;i<10000000;i++){}}' - the run time will differ between events, because they were dynamically multiplexed during program execution; and sleep 1 is too short for multiplexing to activate.)
For sleep 1 there is only a very small amount of code active on the CPU: just the libc startup code and the nanosleep syscall that sleeps for 1 second (check strace sleep 1). So in your output, 444665 is in nanoseconds: roughly 444 microseconds, or 0.444 milliseconds, or 0.000444 seconds of libc startup for the sleep 1 process.
If you want to measure whole-system activity for one second, try adding the -a option of perf stat (profile all processes), optionally with -A to separate events per CPU core (or with -I 100 for periodic printing):
perf stat -a sleep 1
perf stat -Aa sleep 1
perf stat -a -x , sleep 1
perf stat -Aa -x , sleep 1
Related
I want to get the CPU and RAM usage of a Snakemake pipeline over time.
I run my pipeline on a Slurm-managed cluster. I know that Snakemake
includes benchmarking functions, but they only report peak consumption.
Ideally, I would like to have an output file looking like this:
t CPU RAM
1 103.00 32
2 ... ...
Is there any program to do so?
Thanks!
I don't know of any program that already does this, but you can monitor the CPU and MEM usage via native unix commands; this post gives an answer that could fit your requirements.
Here is a summary of the answer modified for this context:
You can use this bash function
logsnakemake() { while sleep 1; do ps -p $1 -o pcpu= -o pmem= ; done; }
You can tweak the frequency of logging by modifying the value of sleep.
To log your snakemake process with pid=123 just type in the terminal:
$ logsnakemake 123 | tee /tmp/pid.log
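If you want output closer to the "t CPU RAM" layout from the question, a variation like this (purely a sketch; note that pmem is a percentage of total RAM, not an absolute amount) prepends an elapsed-seconds column:
logsnakemake() { t=0; while sleep 1; do t=$((t+1)); echo "$t $(ps -p "$1" -o pcpu= -o pmem=)"; done; }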
I've found Syrupy on GitHub: a ps parser in Python with clear documentation.
Is there a way to determine (in bash) how much time is remaining on a process that is running for a specified time?
For example, some time after executing
caffeinate -s -t 8000 &
is there a command or technique for determining when my system will be allowed to sleep?
Bash won't know that caffeinate has a timer attached to it; for all it knows, -t refers to the number of times you'll place an Amazon order of Red Bull before the process exits five minutes later.
If you know this, however, you can detect when the command was started and do the math yourself.
$ sleep 45 &
[1] 16065
$ ps -o cmd,etime
CMD                 ELAPSED
sleep 45              00:03
ps -o cmd,etime       00:00
/bin/bash        6-21:11:11
On OS X, this will be ps -o command,etime; see the FreeBSD ps man page or Linux ps docs for details and other switches/options.
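If your ps supports the etimes field (elapsed time in plain seconds; available in procps-ng and FreeBSD 9+, but not in every BSD-derived ps), the math is easy to script. A sketch, where remaining and the 8000 are just illustrative:
remaining() {  # usage: remaining <pid> <total_seconds>
  local elapsed
  elapsed=$(ps -p "$1" -o etimes=) || return
  echo $(( $2 - elapsed ))
}
$ caffeinate -s -t 8000 &
$ remaining $! 8000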
I need a shell script that will create a loop to start parallel tasks read in from a file...
Something along the lines of...
#!/bin/bash
mylist=/home/mylist.txt
while read -r i; do
    # do something like:
    cp -rp "$i" /destination &
done < "$mylist"
wait
So what I am trying to do is send a bunch of tasks into the background with "&", one for each line in $mylist, and wait for them all to finish before exiting.
However, there may be a lot of lines in the file, so I want to control how many parallel background processes get started; I'd like to cap it at, say, 5 or 10.
Any ideas?
Thank you
Your system's scheduler will make it seem like you can run arbitrarily many parallel jobs; how many you can actually run with maximum efficiency depends on your processor. Overall you don't have to worry about starting too many processes, because the system will schedule them for you. If you want to limit them anyway, because the number could get absurdly high, you could use something like this (provided you execute a cp command every time):
...
while ...; do
    jobs=$(pgrep 'cp' | wc -l)
    # { } rather than ( ): 'continue' in a subshell would not affect this loop
    [[ $jobs -gt 50 ]] && { sleep 100; continue; }
    ...
done
The number of running cp commands is stored in the jobs variable, and before starting a new iteration the loop checks whether there are already too many. Note that we jump to a new iteration, so you'd have to keep track of how many commands you have already executed. Alternatively you could use wait.
Edit:
On a side note, you can assign a specific CPU core to a process using taskset; it may come in handy when you have fewer, more complex commands.
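Building on the wait suggestion above: bash 4.3+ has wait -n, which returns as soon as any one background job exits, so the cap can be enforced without counting iterations. A sketch (max_jobs, the file path, and the cp command are assumptions taken from the question):
#!/bin/bash
max_jobs=5
mylist=/home/mylist.txt
while read -r f; do
    # once the cap is reached, block until one job exits
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
        wait -n
    done
    cp -rp "$f" /destination &
done < "$mylist"
wait    # let the last few jobs finish before exiting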
You are probably looking for something like this using GNU Parallel:
parallel -j10 cp -rp {} /destination :::: /home/mylist.txt
GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.
If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU. GNU Parallel instead spawns a new process whenever one finishes, keeping the CPUs active and thus saving time.
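As a toy illustration of that scheduling (the job list and the sleep are arbitrary), this runs 32 jobs at most 4 at a time, starting a new one the moment a slot frees up:
$ seq 32 | parallel -j4 'sleep 1; echo job {} done'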
Installation
If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README
Learn more
See more examples: http://www.gnu.org/software/parallel/man.html
Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html
Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel
What is the difference between the unix commands ps and ps -www? From the man page I saw this statement: "-w Wide output. Use this option twice for unlimited width." But when I use -www I don't see any difference in the output.
-bash-3.2$ ps 18451
PID TTY STAT TIME COMMAND
18451 ? Ds 1:02 ora_xxxx
-bash-3.2$ ps -www 18451
PID TTY STAT TIME COMMAND
18451 ? Ds 1:02 ora_xxxx
"Wide" in the sense that it doesn't truncate the output, not in the sense that it will output extra columns. Try it with ps -eF and ps -ewF and you will likely see the difference (depending on terminal size). If you instead want to display full output with more columns, use ps -f or ps -F instead (or ps -o to specify which columns you want displayed).
Note that -ww gives you unlimited width; adding more w's does not make it any wider.
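To see the effect, run both in a terminal that is narrower than your longest command line (exactly where truncation kicks in depends on your window size):
$ ps -eF      # long COMMAND values are chopped at the window edge
$ ps -ewwF    # nothing is chopped; long lines wrap onto the next line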
One of my favourites is ps -efHww. It shows all processes as a hierarchy along with start time and consumed CPU time, with complete command lines.
Yes, there is a difference. You do not see any in your example because the output is not wide enough. But if the output exceeds a certain width (the terminal width), then you will see a difference: the columns are no longer chopped. Instead, the output wraps onto the next line.
I am trying to limit the allowed CPU time of a subprocess in a ksh script. I tried using ulimit (hard and soft values), but the subprocess always breaks the limit (if it takes longer than the allowed time).
# value for a test
Sc_Timeout=2
Sc_FileOutRun=MyScript.log.Running
Sc_Cmd=./AScriptToRunInSubShell.sh
(
    ulimit -Ht ${Sc_Timeout}
    ulimit -St ${Sc_Timeout}
    time (
        ${Sc_Cmd} >> ${Sc_FileOutRun} 2>&1
    ) >> ${Sc_FileOutRun} 2>&1
    # some other command not relevant for this
)
result:
1> ./MyScript.log.Running
ulimit -Ht 2
ulimit -St 2
1>> ./MyScript.log.Running 2>& 1
real 0m11.45s
user 0m3.33s
sys 0m4.12s
I expect a timeout error with a sys or user time of something like 0m2.00s
When I run a test directly from the command line, the hard ulimit does seem to limit the CPU time effectively, but not in the script.
The test/dev system is AIX 6.1, but this should also work on other versions, and on Sun and Linux.
Each process has its own time limits, but time shows the cumulative time for the script. Each time you create a child process, that child will have its own limits. So, for example, if you call cut and grep in the script, then those processes use their own CPU time; the quota is not decremented from the script's, although the limits themselves are inherited.
If you want a time limit, you might wish to investigate trap ALRM.
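A minimal sketch of that idea, reusing the variable names from the question (the watchdog subshell and the message are illustrative; note this enforces wall-clock time, whereas ulimit -t limits CPU time):
Sc_Timeout=2
Sc_Cmd=./AScriptToRunInSubShell.sh

trap 'echo "timed out after ${Sc_Timeout}s"; kill ${Sc_Pid} 2>/dev/null' ALRM

${Sc_Cmd} &
Sc_Pid=$!

# watchdog: deliver SIGALRM to this script once the timeout expires
( sleep ${Sc_Timeout} && kill -ALRM $$ ) &
Sc_WatchdogPid=$!

wait ${Sc_Pid}
kill ${Sc_WatchdogPid} 2>/dev/null  # finished in time: cancel the watchdog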