I have a program which fails sporadically, but always with the same error. To debug it I'd like to run it under GDB until it fails, then set breakpoints and re-run it.
What I currently do is:
gdb --args /path/to/program <program args>
But I can't find anywhere how to tell GDB "run this program 100 times", for example.
The simplest solution I can think of is to run the program in an infinite while loop until it fails or you press Ctrl+C to break the loop.
(gdb) while 1
>run
>end
The following gdb script will run the program 100 times, or until it receives a signal. $_siginfo is non-void if the program is stopped due to a signal, and void if the program exited. As far as I can tell, any stop of the process, including breakpoints, watchpoints, and single-stepping, will set $_siginfo to something.
set $n = 100
while $n-- > 0
  printf "starting program\n"
  run
  if $_siginfo
    printf "Received signal %d, stopping\n", $_siginfo.si_signo
    loop_break
  else
    printf "program exited\n"
  end
end
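If you want to run this non-interactively, one option is to save the loop above in a command file and pass it with -x. A minimal sketch (retry.gdb is a hypothetical file name):

# retry.gdb holds the counted-run loop above (hypothetical file name)
gdb -x retry.gdb --args /path/to/program <program args>

Without --batch, gdb stays at its prompt once the loop breaks out, so you can inspect the failed run.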
I have a bash script which starts 2 processes:
openocd ...flags... 2>openocd.log &
arm-none-eabi-gdb
When in gdb, interrupting execution with Ctrl+C causes openocd to receive SIGINT as well, and thus it stops. I've tried to trap SIGINT and reissue it directly to gdb with:
trap 'kill 2 $!' INT
But apart from requiring root, it does not work anyway:
./dbg.sh: 1: kill: No such process
Are there elegant ways to perform the task?
Update: running the script with debug options enabled helped a lot, but I still encounter weird behavior. Here is the content of my script:
#!/bin/sh
set -vx
trap 'killall -s2 arm-none-eabi-gdb-py' 2
openocd -f ...flags... 2>openocd.log & arm-none-eabi-gdb-py
When I run killall -s2 arm-none-eabi-gdb-py from a different tty, it terminates execution of the remote target and does not close openocd, but sending SIGINT through Ctrl+C returns:
+ killall -s2 arm-none-eabi-gdb-py
arm-none-eabi-gdb-py: no process found
It seems like trap does not inhibit signals at all... changing it to trap 'ps -ef' INT reveals that openocd AND gdb are already down when the trap command executes.
Isn't there a missing '&' in your command (that would produce the behavior you describe)?
openocd -f ...flags... 2>openocd.log && arm-none-eabi-gdb-py
A legacy program most likely gets into an infinite loop on certain pathological inputs. I have >1000 such instances; however, I suspect that the vast majority of them trigger the same bug. Therefore, I would like to reduce the >1000 instances to the fundamentally different ones. The first step is to pause the application after, say, 10 seconds and collect the backtrace.
If I run:
gdb --batch --command=backtrace.txt --args ./legacy_program
with backtrace.txt containing
run
bt
and hit Ctrl + C after 10 seconds in the same terminal, I get exactly the backtrace I want.
Now, I would like to do that automatically. I have tried sending SIGINT (the expected equivalent of Ctrl + C) from another terminal but I do not get the backtrace anymore. Here are some of my failed attempts based on
GDB how to stop execution without a breakpoint?
Neither of these has any effect:
pkill -SIGINT gdb
kill -SIGINT 5717
where 5717 is the PID of the only gdb running. Sending SIGINT to the legacy_program the same way does kill it, but then I do not get the backtrace:
Program received signal SIGINT, Interrupt.
Quit
How can I programmatically pause the execution of the legacy_program after 10 seconds and get a backtrace?
This post was motivated by my frustration at not being able to find an answer to this question here at StackOverflow.
Also note that
[it is not merely OK to ask and answer your own question, it is explicitly encouraged.](https://blog.stackoverflow.com/2011/07/its-ok-to-ask-and-answer-your-own-questions/)
Apparently, it is a known (bug) feature in gdb, see
GDB is not trapping SIGINT. Ctrl+C terminates program when should break gdb. Try sending SIGSTOP instead from the other terminal:
pkill -STOP legacy_program
It works on my machine.
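To make the 10-second pause automatic, a rough sketch (untested) that glues this together with the batch invocation from the question:

# backtrace.txt contains "run" and "bt" as in the question
gdb --batch --command=backtrace.txt --args ./legacy_program &
sleep 10
pkill -STOP legacy_program   # the stop returns control to gdb, which then runs "bt"
wait                         # wait for gdb to print the backtrace and exit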
Note that you do not have to run the legacy_program in the debugger. Enable core dumps:
ulimit -c unlimited
and send the program SIGTRAP to make it crash, then get the backtrace from the core dump. So, start the program:
./legacy_program
From another terminal:
pkill -TRAP legacy_program
The backtrace can be obtained like this:
gdb --batch -ex=bt ./legacy_program core
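Putting the pieces together with the 10-second timeout, a rough sketch (untested; the core file name depends on your kernel's core_pattern):

ulimit -c unlimited            # allow core dumps in this shell
./legacy_program &
sleep 10
pkill -TRAP legacy_program     # SIGTRAP's default action terminates with a core dump
wait
gdb --batch -ex=bt ./legacy_program core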
I am trying to debug a concurrent program in LLDB and am getting a seg fault, but not on every execution. I would like to run my process over and over until it hits a seg fault. So far, I have the following:
b exit
breakpoint com add 1
Enter your debugger command(s). Type 'DONE' to end.
> run
> DONE
The part that I find annoying is that when I reach the exit function, hit my breakpoint, and the run command gets executed, I get the following prompt from LLDB:
There is a running process, kill it and restart?: [Y/n]
I would like to automatically restart the process, without having to manually enter Y each time. Anyone know how to do this?
You could kill the previous instance by hand with kill (which doesn't prompt); then the run command won't prompt either.
Or:
(lldb) settings set auto-confirm 1
will give the default (capitalized) answer to all lldb queries.
Or if you have Xcode 6.x (or current TOT svn lldb) you could use the lldb driver's batch mode:
$ lldb --help
...
-b
--batch
Tells the debugger to running the commands from -s, -S, -o & -O,
and then quit. However if any run command stopped due to a signal
or crash, the debugger will return to the interactive prompt at the
place of the crash.
So for instance, you could script this in the shell, running:
lldb -b -o run
in a loop, and this will stop if the run ends in a crash rather than a normal exit. In some circumstances this might be easier to do.
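For example, a minimal sketch of such a shell loop (./my_program is a placeholder for your binary):

# On a normal exit lldb quits and the loop relaunches the program;
# on a crash lldb drops back to its interactive prompt, pausing the loop there.
while true; do
    lldb -b -o run ./my_program
done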
I'm debugging a tree of processes using gdb's very handy multiple-inferior support:
(gdb) set detach-on-fork off
(gdb) set schedule-multiple on
(gdb) set follow-fork-mode parent
(gdb) break PostgresMain
(gdb) break PostmasterMain
and now I need to let things run until I hit one of the future breakpoints in some yet-to-be-spawned inferior.
However, gdb seems to be "helpfully" pausing whenever an inferior exits normally, or at least blocking cleanup of the inferior so that its parent's wait() can return:
(gdb) c
[New process 16505]
process 16505 is executing new program: /home/craig/pg/bdr/bin/pg_config
Reading symbols from /home/craig/pg/bdr/bin/pg_config...done.
[Inferior 2 (process 16505) exited normally]
(gdb) info inferior
Num Description Executable
* 2 <null> /home/craig/pg/bdr/bin/pg_config
1 process 16501 /usr/bin/make
(gdb) inferior 1
[Switching to inferior 1 [process 16501] (/usr/bin/make)]
[Switching to thread 1 (process 16501)]
#0 0x0000003bc68bc502 in __libc_wait (stat_loc=0x7fffffffbc78) at ../sysdeps/unix/sysv/linux/wait.c:30
30 return INLINE_SYSCALL (wait4, 4, WAIT_ANY, stat_loc, 0,
(gdb)
so I have to endlessly:
(gdb) inferior 1
(gdb) c
to carry on. That's about 70 times before I hit the desired breakpoint in a child of a child of a child.
I think what's happening is that gdb treats process exit as a stop event, and since non-stop is set to off (the default) it stops all threads in all inferiors when one thread stops. However, since this inferior has terminated it isn't a normal stop event, so you can't just cont it; you have to switch to another process first.
Is there some way to stop gdb pausing at each inferior exit? I would've expected follow-fork-mode parent with schedule-multiple on to do the trick, but gdb seems to still want to stop when an inferior exits.
I guess I'm looking for something like a "skip proc-exit", or a virtual signal I can change the handler policy on so it doesn't stop.
set non-stop on seems like it should be the right answer, but I suspect it's broken for multiple inferiors.
If I use non-stop on, then after the first exit trap, gdb's internal state indicates that inferior 1 is running:
(gdb) info inferior
Num Description Executable
* 1 process 20540 /usr/bin/make
(gdb) info thread
Id Target Id Frame
* 1 process 20540 "make" (running)
(gdb) cont
Continuing.
Cannot execute this command while the selected thread is running.
but the kernel sees it as blocked on ptrace_stop:
$ ps -o "cmd,wchan" -p 20540
CMD WCHAN
/usr/bin/make check ptrace_stop
... and it makes no progress until gdb is detached, or it's killed. Signals to the process are ignored, and interrupt in gdb has no effect.
I'm using GNU gdb (GDB) Fedora 7.7.1-18.fc20 on x86_64.
After stumbling on a post that references it in passing, I found that the missing magic is set target-async on alongside set non-stop on.
non-stop mode, as expected, means gdb won't stop everything whenever an inferior exits. target-async is required to make it actually work correctly on gdb 7.7; it's the default on 7.8.
So the full incantation is:
set detach-on-fork off
set schedule-multiple on
set follow-fork-mode parent
set non-stop on
set target-async on
For 7.8, omit set target-async on and, to reduce noise, add set print symbol-loading off.
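For reference, the same settings can also be passed on the command line with -ex; a sketch, reusing the make check target from the session above:

gdb -ex 'set detach-on-fork off' \
    -ex 'set schedule-multiple on' \
    -ex 'set follow-fork-mode parent' \
    -ex 'set non-stop on' \
    -ex 'set target-async on' \
    --args /usr/bin/make check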
The following Python extension to gdb will switch back to the first inferior and resume execution after each stop.
It feels like a total hack, but it works. When a process exits, the handler sets a flag indicating that we stopped because of an exit, then switches to the original process. gdb then stops execution, delivering a stop event. We check whether the stop was caused by our own request and, if so, immediately continue.
The code also sets up the breakpoints I'm using and the multi-process settings, so I can just source thescript.py and run.
# gdb Python extension: after an inferior exits, switch back to the
# lowest-numbered inferior that still has threads and keep going.

gdb.execute("set python print-stack full")
gdb.execute("set detach-on-fork off")
gdb.execute("set schedule-multiple on")
gdb.execute("set follow-fork-mode parent")

gdb.execute("set breakpoint pending on")
gdb.execute("break PostgresMain")
gdb.execute("break PostmasterMain")
gdb.execute("set breakpoint pending off")

# Flag set by exit_handler so stop_handler knows the stop was ours.
my_stop_request = False

def do_continue():
    gdb.execute("continue")

def exit_handler(event):
    # When an inferior exits, switch to the lowest-numbered inferior
    # that still has live threads and remember that we caused the stop.
    global my_stop_request
    has_threads = [inferior.num for inferior in gdb.inferiors() if inferior.threads()]
    if has_threads:
        has_threads.sort()
        gdb.execute("inferior %d" % has_threads[0])
        my_stop_request = True

gdb.events.exited.connect(exit_handler)

def stop_handler(event):
    # Resume automatically if the stop came from our own inferior switch,
    # but leave real signals and breakpoints alone.
    global my_stop_request
    if isinstance(event, gdb.SignalEvent):
        pass
    elif isinstance(event, gdb.BreakpointEvent):
        pass
    elif my_stop_request:
        my_stop_request = False
        gdb.post_event(do_continue)

gdb.events.stop.connect(stop_handler)
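A hypothetical invocation that loads the script at gdb startup instead of sourcing it by hand (again borrowing the make check target from the session above):

gdb -x thescript.py --args /usr/bin/make check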
There must be an easier way than this. It's ugly.
My current script looks like this:
cd ~/.wine/drive_c/
echo "test123" > foo$$.txt
wine start "C:\foo$$.txt"
wineserver -w
echo "Wine is done!"
which works fine when only one program is running in Wine at a time. However, if I run this a second time before the first program is done, both scripts will wait for each other's programs to exit.
This does not work:
cd ~/.wine/drive_c/
echo "test123" > foo$$.txt
$(wine start "C:\foo$$.txt") &
wait ${!}
echo "Wine is done!"
as it will exit before you close the text editor.
I need to use the start command because I want the file to be opened with its default editor/viewer.
To wait for the process started by wine to exit, you can pipe its output to another program.
On my system, the following achieves the desired effect:
wine "program.exe" | cat
echo "program.exe has finished"
If you want to wait asynchronously:
wine "program.exe" | cat & pid=$!
# ...
wait $pid
echo "program.exe has finished"
wineserver has a --wait flag which can be used to do exactly that.
However, if you run multiple programs at once, it will wait for all of them to finish.
wine <program> waits until the program exits. wine start <program> does not.
A summary:
wine <program> starts the program and waits until it is finished. I recommend using this method.
wine start <program> starts the program and immediately exits without waiting. The program will keep running in the background.
wine start /wait <program> starts the program and waits until it is finished. This is the same behavior as wine <program> (see the sketch after this summary).
wineserver --wait waits until all programs and all services in Wine are finished. This command does not launch any program itself but waits for existing programs and services.
Services like services.exe, plugplay.exe, and winedevice.exe keep on running a few seconds after the last program finishes, and wineserver --wait also waits until these services exit.
Some of these services hold state and write their state (and the registry) to disk when they exit. So if you want to back up or remove your Wine prefix, make sure to wait until these services have exited.
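Applied to the script from the question, a minimal sketch (untested) of the /wait variant:

#!/bin/sh
cd ~/.wine/drive_c/
echo "test123" > foo$$.txt
# open the file with its default Windows handler and block until that
# handler exits, without waiting for unrelated Wine processes
wine start /wait "C:\foo$$.txt"
echo "Wine is done!"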
What happens is that wine just asks wineserver to start the program and exits, and I have found no good mechanism to get notifications from wineserver about the processes that it spawns.
My suggestion would be to wait for the completion of the process started by wineserver using one of the methods in How to wait for exit of non-children processes, but you need to know its PID. Possible ideas: run wineserver -f -d | grep init_thread( and get PIDs from there (though I can see no easy way to find out which is which, avoid race conditions, and ignore noise), or try to find your process in the output of ps, which is ugly and definitely not robust.
If nothing better surfaces, you might want to suggest the addition of such a feature to the Wine devs (probably as a flag to wine).
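For what it's worth, once the PID is known, the polling approach from that linked question reduces to something like this sketch ($pid is assumed to already hold the PID of the wine-spawned process, which is exactly the part with no good answer yet):

while kill -0 "$pid" 2>/dev/null; do   # signal 0 only checks that the process still exists
    sleep 1
done
echo "process $pid has exited"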