I've been trying to use perf to profile my running process, but I cannot make sense of some of the numbers perf outputs. Here is the command I used and the output I got:
$ sudo perf stat -x, -v -e branch-misses,cpu-cycles,cache-misses sleep 1
Using CPUID GenuineIntel-6-55-4
branch-misses: 7751 444665 444665
cpu-cycles: 1212296 444665 444665
cache-misses: 4902 444665 444665
7751,,branch-misses,444665,100.00,,
1212296,,cpu-cycles,444665,100.00,,
4902,,cache-misses,444665,100.00,,
May I know what the number "444665" represents?
The -x format of perf stat is described in the perf-stat man page, in the CSV FORMAT section. Here is a fragment of that man page, without the optional columns:
CSV FORMAT
With -x, perf stat is able to output a not-quite-CSV format output.
Commas in the output are not put into "". To make it easy to parse it
is recommended to use a different character like -x \;
The fields are in this order:
· counter value
· unit of the counter value or empty
· event name
· run time of counter
· percentage of measurement time the counter was running
Additional metrics may be printed with all earlier fields being
empty.
So you have the counter value, an empty unit field, the event name, the run time of the counter, and the percentage of time the counter was active (relative to the program's running time).
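For example, the first line of your output decodes field by field like this (the trailing empty fields are the optional columns omitted from the fragment above):
7751,,branch-misses,444665,100.00,,
· counter value: 7751
· unit of the counter value: empty
· event name: branch-misses
· run time of counter: 444665
· percentage of measurement time the counter was running: 100.00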
By comparing the output of these two commands (as recommended by Peter Cordes in a comment):
perf stat awk 'BEGIN{for(i=0;i<10000000;i++){}}'
perf stat -x \; awk 'BEGIN{for(i=0;i<10000000;i++){}}'
I think that the run time is in nanoseconds and covers all the time this counter was active. When you run perf stat with a non-conflicting set of events, and there are enough hardware counters to count all the requested events, the run time will be almost the total time the profiled program spent running on the CPU. (Example of a too-large event set: perf stat -x , -e cycles,instructions,branches,branch-misses,cache-misses,cache-references,mem-loads,mem-stores awk 'BEGIN{for(i=0;i<10000000;i++){}}' - the run time will differ between these events, because they were dynamically multiplexed onto the counters during program execution; and sleep 1 is too short for multiplexing to kick in.)
For sleep 1 there is very little code active on the CPU: just the libc startup code and the nanosleep syscall for 1 second (check strace sleep 1). So in your output, 444665 is in ns, which is just 444 microseconds, or 0.444 milliseconds, or 0.000444 seconds of libc startup for the sleep 1 process.
If you want to measure whole-system activity for one second, try adding the -a option of perf stat (profile all processes), optionally with -A to report events separately per CPU core (or with -I 100 for periodic printing):
perf stat -a sleep 1
perf stat -Aa sleep 1
perf stat -a -x , sleep 1
perf stat -Aa -x , sleep 1
Gdb, like any other program, isn't perfect, and every now and then I encounter bugs that render the current Gdb instance unusable. At this point, if I have a debugging session with a lot of valuable state in the inferior, I'd like to be able to just start a new Gdb session on it. That is, detach, quit Gdb and start a new Gdb instance to restart where I left off.
However, when detaching Gdb, it resumes the inferior so that it continues running where it was, which ruins the point of the whole exercise. Therefore, I'm wondering if it's possible to detach in such a state that the inferior is as if it had been sent a SIGSTOP, basically.
I've tried simply killing Gdb, but interestingly, that seems to take the inferior with it. Not sure how that works.
when detaching Gdb, it resumes the inferior
GDB doesn't, the kernel does (assuming Linux).
I've tried simply killing Gdb, but interestingly, that seems to take the inferior with it
The kernel sends it SIGHUP, which normally kills the inferior. You can prevent that either by setting SIGHUP to SIG_IGN in the inferior, or simply with (gdb) call signal(1, 1) (on Linux, SIGHUP is 1 and SIG_IGN is the constant 1).
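For reference, here is a minimal sketch of the SIG_IGN variant in the inferior's own source, assuming Linux signal numbering (this is what the call signal(1, 1) trick above relies on):
#include <signal.h>

int main(void)
{
    /* Ignore SIGHUP so the process survives its process group being orphaned. */
    signal(SIGHUP, SIG_IGN);

    /* ... rest of the program ... */
    return 0;
}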
After that, you can detach and quit GDB, but the kernel will resume the inferior with SIGCONT (see Update below), so you are back to square one.
However, there is a solution. Consider the following program:
#include <stdio.h>
#include <unistd.h>

int main()
{
    while (1) {
        printf("."); fflush(0); sleep(1);
    }
}
gdb -q ./a.out
(gdb) run
Starting program: /tmp/a.out
.....^C
Program received signal SIGINT, Interrupt.
0x00007ffff7ad5de0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
We want the program to not run away on detach, so we send it SIGSTOP:
(gdb) signal SIGSTOP
Continuing with signal SIGSTOP.
Program received signal SIGSTOP, Stopped (signal).
0x00007ffff7ad5de0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 in ../sysdeps/unix/syscall-template.S
(gdb) detach
Detaching from program: /tmp/a.out, process 25382
Note that at this point, gdb is detached (but still alive), and the program is not running (stopped).
Now in a different terminal:
gdb -q -ex 'set prompt (gdb2) ' -p 25382
0x00007ffff7ad5de0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb2) c
Continuing.
Program received signal SIGSTOP, Stopped (signal).
0x00007ffff7ad5de0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 in ../sysdeps/unix/syscall-template.S
(gdb2) sig 0
Continuing with no signal.
The program continues running, printing dots in the first terminal.
Update:
SIGHUP -- Interesting. By what mechanism, though?
Good question. I didn't know, but this appears to be the answer:
From the setpgid man page:
If the exit of the process causes a process group to become orphaned,
and if any member of the newly orphaned process group is stopped,
then a SIGHUP signal followed by a SIGCONT signal will be sent to
each process in the newly orphaned process group.
I have verified that if I detach and quit GDB without stopping the inferior, it doesn't get SIGHUP and continues running without dying.
If I do send it SIGSTOP and arrange for SIGHUP to be ignored, then I see both SIGHUP and SIGCONT being sent in strace, so that matches the man page exactly:
(gdb) detach
Detaching from program: /tmp/a.out, process 41699
In another window: strace -p 41699. Back to GDB:
(gdb) quit
strace output:
--- stopped by SIGSTOP ---
--- SIGHUP {si_signo=SIGHUP, si_code=SI_KERNEL} ---
--- SIGCONT {si_signo=SIGCONT, si_code=SI_KERNEL} ---
restart_syscall(<... resuming interrupted call ...>) = 0
write(1, ".", 1) = 1
...
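If you want to reproduce the man-page behaviour without GDB at all, a minimal sketch (assuming Linux semantics) is to fork a child into its own process group, stop it, and let the parent exit; the kernel then delivers SIGHUP followed by SIGCONT to the stopped child:
#include <signal.h>
#include <unistd.h>

static void on_hup(int sig)
{
    (void)sig;
    write(1, "child: got SIGHUP\n", 18);
}

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {                 /* child */
        signal(SIGHUP, on_hup);     /* survive the SIGHUP instead of dying */
        setpgid(0, 0);              /* become leader of a new process group */
        raise(SIGSTOP);             /* stop, like the GDB example above */
        /* We only get here after the kernel's SIGCONT resumes us. */
        write(1, "child: resumed by SIGCONT\n", 26);
        return 0;
    }
    sleep(1);                       /* give the child time to stop */
    return 0;                       /* parent exits -> child's group is orphaned */
}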
I'm trying to call a batch file using START so I can control the processor affinity of the single compile command inside it.
ATTEMPT #1
START "" /NODE 1 /AFFINITY 0x1 build_one_qcc.bat
But I get the error message
The system cannot accept the START command parameter 1
ATTEMPT #2
START "" build_one_qcc.bat
And that launched a new cmd window but within that window I got the same error message.
ATTEMPT #3
I copy-pasted to my command window the contents of the batch file plus the START command:
start "" /node 1 /affinity 0x1 "qcc -Vgcc_ntoarmv7le ... "
Still got the same error.
What am I doing wrong?
start /node 0 notepad.exe works fine.
start /node 1 notepad.exe works fine on a system with two physical processors.
So what you were "doing wrong" was to run it on a single processor computer ;)
Each (physical) processor has its "own" DIMM slots (which doesn't mean it has no access to the "other" memory - it's just a question of performance). You just can't assign memory that isn't there.
Obviously the 1 in /node 1 can't be processed on a machine with only one NUMA node.
It seems that the documentation for the start command is misleading for [/NODE <NUMA-Node>], or at least it doesn't make clear that <NUMA-Node> must be the number of a node that actually exists on the system.
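If you want to check what the machine actually reports, a minimal sketch using the Win32 GetNumaHighestNodeNumber call (compile with MSVC or MinGW) looks like this:
#include <windows.h>
#include <stdio.h>

int main(void)
{
    ULONG highest_node = 0;
    if (!GetNumaHighestNodeNumber(&highest_node)) {
        fprintf(stderr, "GetNumaHighestNodeNumber failed: %lu\n", GetLastError());
        return 1;
    }
    /* On a single-socket machine this typically prints 0,
       so "start /node 1" has no node to refer to. */
    printf("Highest NUMA node: %lu\n", highest_node);
    return 0;
}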
Why not just use START without /NODE?
START "" /AFFINITY 0x1 build_one_qcc.bat
What I'm trying to do is replicate cat /proc/{pid}/fd/{fd-id} on OSX.
I get the idea that it can be done using gdb.
The process is:
Suppose I want to read from fd 34 of pid 4554.
Run gdb and attach to pid 4554.
Open a file in 'w' mode (to write the data we read from {fd-id}). Suppose the fd of this file is 65.
Seek fd 34 to offset 0.
Start a loop.
Read some data from fd 34 and save it to memory.
Write the buffered data from memory to fd 65.
Keep running this loop until EOF of fd 34 is reached.
Close fd 65.
Detach from pid 4554 and close gdb.
Now, I don't really know much gdb, so can anyone say how to carry out the above steps with gdb commands?
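One way to picture those steps is the plain C loop below - a minimal sketch using the question's example fd numbers; inside gdb you would issue the equivalent open/lseek/read/write/close calls with the call command against the attached process:
#include <fcntl.h>
#include <unistd.h>

/* Copy everything readable from src_fd into a new file
   (the open/seek/read/write/close steps above). */
int copy_fd(int src_fd, const char *out_path)
{
    char buf[4096];
    ssize_t n;

    int out_fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644); /* "fd 65" */
    if (out_fd < 0)
        return -1;

    lseek(src_fd, 0, SEEK_SET);                       /* rewind "fd 34" to offset 0 */
    while ((n = read(src_fd, buf, sizeof buf)) > 0)   /* loop until EOF */
        write(out_fd, buf, (size_t)n);

    close(out_fd);
    return 0;
}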
I am trying to run an app using gdb in an emulator shell. I use the following command:
gdb <path of exe>
However, the app does not launch and I get the following error:
Starting program: <path of exe>
[Thread debugging using libthread_db enabled]
Program exited normally.
However, when I attach a running process to gdb, it works fine.
gdb -pid <process_id>
What could be the reason?
Update:
On Employed Russian's advice, I did these steps:
(gdb) b _start
Breakpoint 1 at 0xb40
(gdb) b main
Breakpoint 2 at 0xc43
(gdb) catch syscall exit
Catchpoint 3 (syscall 'exit' [1])
(gdb) catch syscall exit_group
Catchpoint 4 (syscall 'exit_group' [252])
(gdb) r
Starting program: <Exe Path>
[Thread debugging using libthread_db enabled]
Breakpoint 1, 0x80000b40 in _start ()
(gdb) c
Continuing.
Breakpoint 2, 0x80000c43 in main ()
(gdb) c
Continuing.
Catchpoint 4 (call to syscall 'exit_group'), 0xb7fe1424 in __kernel_vsyscall ()
(gdb) c
Continuing.
Program exited normally.
(gdb)
What does Catchpoint 4 (call to syscall 'exit_group'), 0xb7fe1424 in __kernel_vsyscall () mean?
I probed further and I found this:
Single stepping until exit from function main,
which has no line number information.
__libc_start_main (main=0xb6deb030 <main>, argc=1, ubp_av=0xbffffce4,
init=0x80037ab0 <__libc_csu_init>, fini=0x80037b10 <__libc_csu_fini>,
rtld_fini=0xb7ff1000 <_dl_fini>, stack_end=0xbffffcdc) at libc-start.c:258
258 libc-start.c: No such file or directory.
in libc-start.c
However, libc.so is present, and I have also exported its path using
export LD_LIBRARY=$LD_LIBRARY:/lib
Why is it not loading?
The app does not launch and I get following error
You are mistaken: the app does launch (and the output you get is not an error), and then immediately exits with 0 exit status.
Therefore, you should look at the problem with the app, not a problem with GDB. One way to look at the problem is to set a breakpoint on _start and main, and check whether either of these functions is reached.
If they are, using catch syscall exit or catch syscall exit_group may give you a clue for why the application exits.
Perhaps your application employs anti-reverse-engineering techniques, and detects that it is being debugged?
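For illustration only, a common self-check of that kind looks roughly like the hypothetical sketch below (it is not taken from your app): a process that is already being traced sees ptrace(PTRACE_TRACEME) fail and can silently exit, which would match the exit_group you caught.
#include <stdlib.h>
#include <sys/ptrace.h>

int main(void)
{
    /* If a debugger is already attached, PTRACE_TRACEME fails... */
    if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
        exit(0);            /* ...and the program quietly exits "normally" */

    /* real work happens here when not being debugged */
    return 0;
}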
Update:
you've verified that the application in fact starts, reaches main, and then calls exit. Now all you have to do is figure out why it calls exit. The way to do that is to find out where the exit_group system call is coming from.
To do that, get to that system call (Catchpoint 4) and issue the GDB where command. That will tell you how your application decides to exit.
You also (apparently) built your application without debugging info (usually the -g flag). You'll make debugging easier if you build a debug version of the application.