Attach to and debug an already running Ruby script on Linux?

Attach to and debug an already running Ruby script on Linux? - ruby

I'm on Linux with a Ruby script running in a terminal window (it sits in a while loop with some sleep timeout, and does work when something changes.)
The problem is, occasionally the script seems to freeze and stop responding. A typical scenario is if I leave it sitting overnight.
If I break and restart it, the script works fine.
So, 1) is there a way to attach to this already-running ruby script's interpreter to find out where it's getting stuck? Ideally I'd get a stack trace.
If not possible on-the-fly, 2) how can I run it so that next time it freezes I can get a stack?

I think there are likely better "ruby ways" to solve this problem. But doing a google search on attach to a running ruby process turned up a blog post with some helpful suggestions for using gdb to debug a live Ruby process on Linux: Tools for Debugging Running Ruby Processes. This further linked to another blog post with some useful information on using gdb to get a ruby stack trace:
Find the PID of your ruby script, e.g.
ps aux | grep -i <script_name.rb>
Attach to it with gdb:
sudo gdb `which ruby` <pid>
Run these commands in gdb to get the Ruby backtrace:
(gdb) set $ary = (int)backtrace(-1)
(gdb) set $count = *($ary+8)
(gdb) set $index = 0
(gdb) while $index < $count
> x/1s *((int)rb_ary_entry($ary, $index)+12)
> set $index = $index + 1
>end
This got me close, but gdb encountered an error while loading ruby symbols, and another error trying to run the backtrace function. I'll update this answer as I figure out more. Feel free to make other suggestions.
The blogpost also links an interesting set of gdb recipes for debugging Ruby.

Related

Debugging GDB itself and signal handling issues

I am trying to debug GDB itself and dealing with a Ctrl+C signal problem that is sent from another terminal.
I run GDB to be debugged in terminal 1 in TUI mode. Right after, I open another terminal 2 and find the PID number of the GDB that runs on Terminal 1. Then attach that process to debug.
In Terminal 1
$ build-gdb/gdb/gdb -tui ./build/output.elf -tty=$TTY
In Terminal 2
$ ps -elf | less
$ sudo gdb -p PID_NUMBER-tty=$TTY -tui
The problem is when I hit Ctrl+C to stop GDB in terminal 1, GDB runs on Terminal 2 stops. GDB in Terminal 1 does not responds to ^C command at all. I tried to use -tty parameter and get the current TTY, but id did not solved the problem. GDB uses readline GNU library, but i should be configure terminal and its input properly.
Any idea?

You can use
handle SIGINT pass
to instruct GDB to pass the signal to the inferior. See Signals in the GDB manual. The nostop argument could be useful in this situation, too.

Trouble using the `at` command on mac

I'm unable to get the at command to run from the Mac Terminal.
I've tried at -f test.txt 10:44 which puts it in the queue, but then it never runs.
I've tried sh test.txt | at 10:43 which puts it in the queue (though it never runs), but it also runs the script immediately.
Just running sh test.txt runs fine, it is a test script to send an email.
What am I doing wrong?

I think the answer may be in the man page. man at, IMPLEMENTATION
NOTES Note that at is implemented through the launchd(8) daemon
periodically invoking atrun(8), which is disabled by default. See
atrun(8) for information about enabling atrun.
This is from the comment by Justin Wood.
I thank him for his help - I should have read the man page more closely.

How to pause the execution of a program after 10 seconds and get a backtrace?

A legacy program most likely gets into an infinite loop on certain pathological inputs. I have >1000 such instances, however, I suspect that the vast majority of them trigger the same bug. Therefore, I would like to reduce the >1000 instances to the fundamentally different ones. The first step is to pause the application after, say, 10 seconds and collect the backtrace.
If I run:
gdb --batch --command=backtrace.txt --args ./legacy_program
with backtrace.txt
run
bt
and I hit Ctrl + C after 10 seconds in the same terminal I get exactly the backtrace I want.
Now, I would like to do that automatically. I have tried sending SIGINT (the expected equivalent of Ctrl + C) from another terminal but I do not get the backtrace anymore. Here are some of my failed attempts based on
GDB how to stop execution without a breakpoint?
Neither of these have any effect:
pkill -SIGINT gdb
kill -SIGINT 5717
where 5717 is the pid of the only gdb running. Sending SIGINT to the legacy_program the same way does kill it but then I do not get the backtrace:
Program received signal SIGINT, Interrupt.
Quit
How can I programmatically pause the execution of the legacy_program after 10 seconds and get a backtrace?
This post was motivated by my frustration not being able to find an answer to this question here at StackOverflow.
Also note that
[it is not merely OK to ask and answer your own question, it is explicitly encouraged.](https://blog.stackoverflow.com/2011/07/its-ok-to-ask-and-answer-your-own-questions/)

Apparently, it is a known (bug) feature in gdb, see
GDB is not trapping SIGINT. Ctrl+C terminates program when should break gdb. Try sending SIGSTOP instead from the other terminal:
pkill -STOP legacy_program
It works on my machine.
Note that you do not have to run the legacy_program in the debugger. Enable core dumps
ulimit -c unlimited
and send the program SIGTRAP to make it crash, then get the backtrace from the core dump. So, start the program:
./legacy_program
From another terminal:
pkill -TRAP legacy_program
The backtrace can be obtained like this:
gdb --batch -ex=bt ./legacy_program core

LLDB Restart process without user input

I am trying to debug a concurrent program in LLDB and am getting a seg fault, but not on every execution. I would like to run my process over and over until it hits a seg fault. So far, I have the following:
b exit
breakpoint com add 1
Enter your debugger command(s). Type 'DONE' to end.
> run
> DONE
The part that I find annoying, is that when I get to the exit function and hit my breakpoint, when the run command gets executed, I get the following prompt from LLDB:
There is a running process, kill it and restart?: [Y/n]
I would like to automatically restart the process, without having to manually enter Y each time. Anyone know how to do this?

You could kill the previous instance by hand with kill - which doesn't prompt - then the run command won't prompt either.
Or:
(lldb) settings set auto-confirm 1
will give the default (capitalized) answer to all lldb queries.
Or if you have Xcode 6.x (or current TOT svn lldb) you could use the lldb driver's batch mode:
$ lldb --help
...
-b
--batch
Tells the debugger to running the commands from -s, -S, -o & -O,
and then quit. However if any run command stopped due to a signal
or crash, the debugger will return to the interactive prompt at the
place of the crash.
So for instance, you could script this in the shell, running:
lldb -b -o run
in a loop, and this will stop if the run ends in a crash rather than a normal exit. In some circumstances this might be easier to do.

Can a standalone ruby script (windows and mac) reload and restart itself?

I have a master-workers architecture where the number of workers is growing on a weekly basis. I can no longer be expected to ssh or remote console into each machine to kill the worker, do a source control sync, and restart. I would like to be able to have the master place a message out on the network that tells each machine to sync and restart.
That's where I hit a roadblock. If I were using any sane platform, I could just do:
exec('ruby', __FILE__)
...and be done. However, I did the following test:
p Process.pid
sleep 1
exec('ruby', __FILE__)
...and on Windows, I get one ruby instance for each call to exec. None of them die until I hit ^C on the window in question. On every platform I tried this on, it is executing the new version of the file each time, which I have verified this by making simple edits to the test script while the test marched along.
The reason I'm printing the pid is to double-check the behavior I'm seeing. On windows, I am getting a different pid with each execution - which I would expect, considering that I am seeing a new process in the task manager for each run. The mac is behaving correctly: the pid is the same for every system call and I have verified with dtrace that each run is trigging a call to the execve syscall.
So, in short, is there a way to get a windows ruby script to restart its execution so it will be running any code - including itself - that has changed during its execution? Please note that this is not a rails application, though it does use activerecord.

After trying a number of solutions (including the one submitted by Byron Whitlock, which ultimately put me onto the path to a satisfactory end) I settled upon:
IO.popen("start cmd /C ruby.exe #{$0} #{ARGV.join(' ')}")
sleep 5
I found that if I didn't sleep at all after the popen, and just exited, the spawn would frequently (>50% of the time) fail. This is not cross-platform obviously, so in order to have the same behavior on the mac:
IO.popen("xterm -e \"ruby blah blah blah\"&")

The classic way to restart a program is to write another one that does it for you. so you spawn a process to restart.exe <args>, then die or exit; restart.exe waits until the calling script is no longer running, then starts the script again.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio