`finish` command was not executed when a breakpoint was hit - debugging

Here is part of my gdb script
break __libc_malloc if $x0==1234
command 1
finish
printf "malloc() = 0x%x\n\n",$x0
continue
end
The breakpoint was triggered. But gdb stopped without 'finish' executed.
If the 'finish' command was deleted, the breakpoint was trigger with both 'print' and 'continue' excuted.
So what was wrong?

Related

Call to ls within trapped SIGUSR1 signal results in undesired shell logout

I am attempting to troubleshoot an example regarding bash script signal traps given during a lecture. I copy-pasted the code into a script, but my execution of it results in an undesired and unexpected logout at the end. The instructor makes no mention that this might happen, and I am under the impression that it shouldn't. However, after failing to make contact with my TAs and fellow students, I turn to you for help!
The full slide I am reviewing can be found on page 15 here: http://web.engr.oregonstate.edu/~brewsteb/CS344Slides/3.3%20Signals.pdf
Expected Output:
Triggering a child process termination with a silent ls SIGCHLD Received!
Exiting! [1]+ Done sigchldtest
Actual Output:
Triggering a child process termination with a silent ls
os1 ~/344_operating_systems/examples 1062$ SIGCHLD Received! Exiting!
logout
---------------
The script, sigchldtest:
#!/bin/bash
set -m trap "echo 'Triggering a child process termination with a silent ls'; ls > /dev/null" USR1
trap "echo 'SIGCHLD Received! Exiting!'; exit 0" CHLD
while [ 1 -eq 1 ]
do
echo "nothing" > /dev/null
done
Console input:
sigchldtest &
kill -SIGUSR1 [sigchldtest pID]
Steps Taken
Removed "ls >/dev/null" from line 3. Code executes but does not trigger the CHLD signal trap, as expected.
Removed "exit 0" from line 4. Shell does not log out upon completion, but user must press "enter" to return to the command prompt and the script continues running in the background (not desired). Additionally, the path name should not be printed before "SIGCHLD Received!". It appears the shell prompt is being activated during the middle of my script?
Triggering a child process termination with a silent ls
os1 ~/344_operating_systems/examples 1010$ SIGCHLD Received! Exiting!
{USER MUST PRESS ENTER HERE}
$
Attempted process in both PuTTY and Windows 10 Console, same result.
Attempted to kill the process in another terminal per Ferrybig. Killing the process in the second terminal logged me out of the first one, lol!
Thank you for your time!

Detaching Gdb without resuming the inferior

Gdb, like any other program, isn't perfect, and every now and then I encounter bugs that render the current Gdb instance unusable. At this point, if I have a debugging session with a lot of valuable state in the inferior, I'd like to be able to just start a new Gdb session on it. That is, detach, quit Gdb and start a new Gdb instance to restart where I left off.
However, when detaching Gdb, it resumes the inferior so that it continues running where it was, which ruins the point of the whole exercise. Therefore, I'm wondering if it's possible to detach in such a state that the inferior is as if it had been sent a SIGSTOP, basically.
I've tried simply killing Gdb, but interestingly, that seems to take the inferior with it. Not sure how that works.
when detaching Gdb, it resumes the inferior
GDB doesn't, the kernel does (assuming Linux).
I've tried simply killing Gdb, but interestingly, that seems to take the inferior with it
The kernel sends it SIGHUP, which normally kills the inferior. You can prevent that with either SIG_IGN in the inferior, or simply (gdb) call signal(1, 1).
After that, you can detach and quit GDB, but the kernel will resume the inferior with SIGCONT (see Update below), so you are back to square one.
However, there is a solution. Consider the following program:
int main()
{
while (1) {
printf("."); fflush(0); sleep(1);
}
}
gdb -q ./a.out
(gdb) run
Starting program: /tmp/a.out
.....^C
Program received signal SIGINT, Interrupt.
0x00007ffff7ad5de0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
We want the program to not run away on detach, so we send it SIGSTOP:
(gdb) signal SIGSTOP
Continuing with signal SIGSTOP.
Program received signal SIGSTOP, Stopped (signal).
0x00007ffff7ad5de0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 in ../sysdeps/unix/syscall-template.S
(gdb) detach
Detaching from program: /tmp/a.out, process 25382
Note that at this point, gdb is detached (but still alive), and the program is not running (stopped).
Now in a different terminal:
gdb -q -ex 'set prompt (gdb2) ' -p 25382
0x00007ffff7ad5de0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb2) c
Continuing.
Program received signal SIGSTOP, Stopped (signal).
0x00007ffff7ad5de0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 in ../sysdeps/unix/syscall-template.S
(gdb2) sig 0
Continuing with no signal.
The program continues running, printing dots in the first terminal.
Update:
SIGHUP -- Interesting. By what mechanism, though?
Good question. I didn't know, but this appears to be the answer:
From setpgid man page:
If the exit of the process causes a process group to become orphaned,
and if any member of the newly orphaned process group is stopped,
then a SIGHUP signal followed by a SIGCONT signal will be sent to
each process in the newly orphaned process group.
I have verified that if I detach and quit GDB without stopping the inferior, it doesn't get SIGHUP and continues running without dying.
If I do send it SIGSTOP and arrange for SIGHUP to be ignored, then I see both SIGHUP and SIGCONT being sent in strace, so that matches the man page exactly:
(gdb) detach
Detaching from program: /tmp/a.out, process 41699
In another window: strace -p 41699. Back to GDB:
(gdb) quit
strace output:
--- stopped by SIGSTOP ---
--- SIGHUP {si_signo=SIGHUP, si_code=SI_KERNEL} ---
--- SIGCONT {si_signo=SIGCONT, si_code=SI_KERNEL} ---
restart_syscall(<... resuming interrupted call ...>) = 0
write(1, ".", 1.) = 1
...

How to re-run program in GDB several times?

I have a program which fails sporadically, but with the same error. To debug it I'd like to run it under GDB until it fails, set breakpoints and re-run it.
what do I do:
gdb --args /path/to/program <program args>
But I can't find anywhere how do I tell GDB "run this program 100 times" for example.
The simplest solution I can think of is to run program in infinite while loop until it fails or you press Ctrl+C to break the loop.
(gdb) while 1
>run
>end
This gdb script will run the program 100 times, or until it receives a signal. $_siginfo is non-void if the program is stopped due to a signal, and is void if the program exited. As far as I can tell, any stop of the process, including breakpoints, watchpoints, and single-stepping, will set $_siginfo to something.
set $n = 100
while $n-- > 0
printf "starting program\n"
run
if $_siginfo
printf "Received signal %d, stopping\n", $_siginfo.si_signo
loop_break
else
printf "program exited\n"
end
end

LLDB Restart process without user input

I am trying to debug a concurrent program in LLDB and am getting a seg fault, but not on every execution. I would like to run my process over and over until it hits a seg fault. So far, I have the following:
b exit
breakpoint com add 1
Enter your debugger command(s). Type 'DONE' to end.
> run
> DONE
The part that I find annoying, is that when I get to the exit function and hit my breakpoint, when the run command gets executed, I get the following prompt from LLDB:
There is a running process, kill it and restart?: [Y/n]
I would like to automatically restart the process, without having to manually enter Y each time. Anyone know how to do this?
You could kill the previous instance by hand with kill - which doesn't prompt - then the run command won't prompt either.
Or:
(lldb) settings set auto-confirm 1
will give the default (capitalized) answer to all lldb queries.
Or if you have Xcode 6.x (or current TOT svn lldb) you could use the lldb driver's batch mode:
$ lldb --help
...
-b
--batch
Tells the debugger to running the commands from -s, -S, -o & -O,
and then quit. However if any run command stopped due to a signal
or crash, the debugger will return to the interactive prompt at the
place of the crash.
So for instance, you could script this in the shell, running:
lldb -b -o run
in a loop, and this will stop if the run ends in a crash rather than a normal exit. In some circumstances this might be easier to do.

Preventing debugging session from pausing after each inferior exits

I'm debugging a tree of processes using gdb's very handy multiple-inferior support:
(gdb) set detach-on-fork off
(gdb) set schedule-multiple on
(gdb) set follow-fork-mode parent
(gdb) break PostgresMain
(gdb) break PostmasterMain
and now need to let things run until I hit one of the future breakpoints in some yet to be spawned inferior.
However, gdb seems to be "helpfully" pausing whenever an inferior exits normally, or at least blocking cleanup of the inferior so that its parent's wait() can return:
(gdb) c
[New process 16505]
process 16505 is executing new program: /home/craig/pg/bdr/bin/pg_config
Reading symbols from /home/craig/pg/bdr/bin/pg_config...done.
[Inferior 2 (process 16505) exited normally]
(gdb) info inferior
Num Description Executable
* 2 <null> /home/craig/pg/bdr/bin/pg_config
1 process 16501 /usr/bin/make
(gdb) inferior 1
[Switching to inferior 1 [process 16501] (/usr/bin/make)]
[Switching to thread 1 (process 16501)]
#0 0x0000003bc68bc502 in __libc_wait (stat_loc=0x7fffffffbc78) at ../sysdeps/unix/sysv/linux/wait.c:30
30 return INLINE_SYSCALL (wait4, 4, WAIT_ANY, stat_loc, 0,
(gdb)
so I have to endlessly:
(gdb) inferior 1
(gdb) c
to carry on. About 70 times, before I hit the desired breakpoint in a child of a child of a child.
I think what's happening is that gdb treats process exit as a stop event, and since non-stop is set to off (the default) it stops all threads in all inferiors when one thread stops. However, this inferior has terminated, it isn't a normal stop event, so you can't just cont it, you have to switch to another process first.
Is there some way to stop gdb pausing at each inferior exit? I would've expected follow-fork-mode parent with schedule-multiple on to do the trick, but gdb seems to still want to stop when an inferior exits.
I guess I'm looking for something like a "skip proc-exit", or a virtual signal I can change the handler policy on so it doesn't stop.
set non-stop on seems like it should be the right answer, but I suspect it's broken for multiple inferiors.
If I use non-stop on, then after the first exit trap, gdb's internal state indicates that inferior 1 is running:
(gdb) info inferior
Num Description Executable
* 1 process 20540 /usr/bin/make
(gdb) info thread
Id Target Id Frame
* 1 process 20540 "make" (running)
(gdb) cont
Continuing.
Cannot execute this command while the selected thread is running.
but the kernel sees it as blocked on ptrace_stop:
$ ps -o "cmd,wchan" -p 20540
CMD WCHAN
/usr/bin/make check ptrace_stop
... and it makes no progress until gdb is detached, or it's killed. Signals to the process are ignored, and interrupt in gdb has no effect.
I'm using GNU gdb (GDB) Fedora 7.7.1-18.fc20 on x86_64.
After stumbling on a post that references it in passing I found that the missing magic is set target-async on alongside set non-stop on.
non-stop mode, as expected, means gdb won't stop everything whenever an inferior exits. target-async is required to make it actually work correctly on gdb 7.7; it's the default on 7.8.
So the full incantation is:
set detach-on-fork off
set schedule-multiple on
set follow-fork-mode parent
set non-stop on
set target-async on
For 7.8, remove target-async on and, to reduce noise, add set print symbol-loading off.
The following Python extension to gdb will switch back to the first inferior and resume execution after each stop.
It feels like a total hack, but it works. When a process exits it sets a flag indicating that it stopped on an exit, then switches to the original process. gdb will then stop execution, delivering a stop event. We check to see if the stop was caused by our stop event and if so, we immediately continue.
The code also sets up the breakpoints I'm using and the multi-process settings, so I can just source thescript.py and run .
gdb.execute("set python print-stack full")
gdb.execute("set detach-on-fork off")
gdb.execute("set schedule-multiple on")
gdb.execute("set follow-fork-mode parent")
gdb.execute("set breakpoint pending on")
gdb.execute("break PostgresMain")
gdb.execute("break PostmasterMain")
gdb.execute("set breakpoint pending off")
def do_continue():
gdb.execute("continue")
def exit_handler(event):
global my_stop_request
has_threads = [ inferior.num for inferior in gdb.inferiors() if inferior.threads() ]
if has_threads:
has_threads.sort()
gdb.execute("inferior %d" % has_threads[0])
my_stop_request = True
gdb.events.exited.connect(exit_handler)
def stop_handler(event):
global my_stop_request
if isinstance(event, gdb.SignalEvent):
pass
elif isinstance(event, gdb.BreakpointEvent):
pass
elif my_stop_request:
my_stop_request = False
gdb.post_event(do_continue)
gdb.events.stop.connect(stop_handler)
There must be an easier way than this. It's ugly.

Resources