ptrace(PTRACE_SINGLESTEP) + waitpid = SIGCHLD - ptrace

I'm ptracing a multithreaded application and 9 out of 10 times, the breakpoint handling works just fine, but sometimes I get a SIGCHLD event instead of SIGTRAP.
This is the sequence:
application is running, main thread hits INT3
debugger's waitpid returns SIGTRAP
debugger SIGSTOPs all threads that are not already "t (tracing stop)", using tgkill
debugger runs ptrace(PTRACE_SINGLESTEP) on INT3'ed thread (after fixing RIP and 0xCC byte)
debugger waitpid's and expects SIGTRAP, but gets SIGCHLD instead
What am I supposed to do with this SIGCHLD? Ignoring it makes the debugger get stuck forever in the following waitpid calls. Injecting it back into the debuggee with PTRACE_CONT interferes with the initial PTRACE_SINGLESTEP.
It seems to happen only for main threads (PID==TID), not for child threads (aka LWPs).
I'm using Ubuntu 12.04 64-bit in VirtualBox.

Injecting SIGCHLD back into the debuggee with PTRACE_SINGLESTEP (passing the signal in the data param) seems to do the trick.
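For anyone landing here, this is roughly what that fix looks like in code. It is only a minimal sketch, assuming Linux and a thread that is already in a ptrace stop; the helper names signal_to_forward and step_forwarding_signals are made up for illustration and are not part of any API.

```c
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Given a waitpid() status for a stopped tracee, return the value to pass
 * as the ptrace "data" argument on the next resume: 0 if the stop was our
 * own SIGTRAP, otherwise the signal that should be delivered to the tracee. */
int signal_to_forward(int status)
{
    assert(WIFSTOPPED(status));
    int sig = WSTOPSIG(status);
    return sig == SIGTRAP ? 0 : sig;
}

/* Single-step thread `tid`, re-injecting any unrelated signal (such as
 * SIGCHLD) with the next PTRACE_SINGLESTEP instead of dropping it.
 * Returns 0 once a SIGTRAP stop is seen, -1 on error or if the thread
 * is no longer stopped (for example because it exited). */
int step_forwarding_signals(pid_t tid)
{
    int sig = 0;
    for (;;) {
        if (ptrace(PTRACE_SINGLESTEP, tid, (void *)0, (void *)(long)sig) == -1)
            return -1;

        int status;
        if (waitpid(tid, &status, __WALL) == -1)
            return -1;
        if (!WIFSTOPPED(status))
            return -1;

        sig = signal_to_forward(status);
        if (sig == 0)
            return 0;   /* SIGTRAP: the step has completed */
        /* Otherwise loop and deliver `sig` together with the next step. */
    }
}
```

Two caveats with this sketch: if the debuggee has a handler installed for the forwarded signal, the step may end up at the start of that handler rather than at the instruction after the breakpoint, and a stop caused by the debugger's own SIGSTOP would need to be filtered out rather than forwarded.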

Related

Job control in Ruby - SIGCONT handlers not working, SIGTSTP handler working only for irb. What am I missing?

I was working on trying to implement some kind of shell job control for a custom event loop handler with the GLib2 API in Ruby-GNOME. Ideally, this would be able to handle SIGTSTP and SIGCONT signals, to background the process at a TTY when running under a shell and to resume the background process on 'fg' from the shell.
I've not been able to figure out how to completely approach this with the API available in Ruby.
For a simpler usage case, I thought that I'd try adding a similar job support for IRB. I've added the following to my ~/.irbrc. The SIGTSTP handler seems to work, but the process remains suspended even after SIGCONT from fg in BASH.
## conditional section for ~/.irbrc
## can be activated with `IRB_JOBS_TEST=Defined irb`
if ENV['IRB_JOBS_TEST']
  module Jobs
    TSTP_HDLR_ORIG ||= Signal.trap("TSTP") do
      STDERR.puts "\nJobs: backgrounding #{Process.pid} (#{TSTP_HDLR_ORIG.inspect}, #{CONT_HDLR_ORIG.inspect})"
      Process.setpgid(0, Process.ppid)
      TSTP_HDLR_ORIG.call if TSTP_HDLR_ORIG.respond_to?(:call)
    end
    CONT_HDLR_ORIG ||= Signal.trap("CONT") do
      Process.setpgid(0, Process.pid)
      STDERR.puts "Continuing in #{Process.pid}" ## not reached, not shown
      IRB.CurrentContext.thread.wakeup ## no effect
      CONT_HDLR_ORIG.call if CONT_HDLR_ORIG.respond_to?(:call)
    end
  end
end
I'm testing this on FreeBSD 13.1. I've read the FreeBSD termios(4), tcsetpgrp(3), and fcntl(2) manual pages. I'm not sure how much of the terminal-related API is available in Ruby.
The TSTP handler here seems to work, but the CONT handler is apparently not ever reached. I'm not sure if the TSTP handler is actually doing enough for - in effect - backgrounding the process in the shell's process group and relinquishing the controlling terminal.
With that TSTP handler, I can then background the IRB process in the shell with Ctrl-z. I can also foreground the process with 'fg' or BASH '%', but then the process is unresponsive. FreeBSD's Ctrl-t handler shows the process as suspended. Apparently nothing in my CONT handler is reached.
I'm really stumped about what's failing in this approach - what my TSTP/CONT handlers are missing, what's available in Ruby, and why the process stays suspended after 'fg' in the shell.
In a more complex example, with the code I've written for glib2 it was apparently not enough to just call
Process.setpgid(0, Process.ppid)
as the process was not being backgrounded then. This would probably need another question though, as the example code for it isn't quite so short. So, I thought I'd try starting with IRB ...
After trying to foreground the process, then with Ctrl-t at the TTY on FreeBSD, I'm seeing the following
$ %
IRB_JOBS_TEST=Defined irb
load: 0.16 cmd: ruby31 4076 [suspended] 2.36r 0.19u 0.03s 1% 23828k
mi_switch+0xc2 thread_suspend_check+0x260 sleepq_catch_signals+0x113 sleepq_wait_sig+0x9 _cv_wait_sig+0xec tty_wait_background+0x30d ttydev_ioctl+0x14b devfs_ioctl+0xc6 vn_ioctl+0x1a4 devfs_ioctl_f+0x1e kern_ioctl+0x25b sys_ioctl+0xf1 amd64_syscall+0x10c fast_syscall_common+0xf8
So, it's blocking in an ioctl on resume?
Update
After a few hours of ineffectual hacking about this, I've removed the SIGTSTP and SIGCONT signal handlers from my GLib example code and now it "Just Works". I can background the example app at the console ... at least when it's not running under IRB ... and I can bring it back to the process group foreground with the shell. It resumes running on SIGCONT and everything looks alright in the logging from its main event loop.
I'm still not certain what the missing parts may have been in my handlers/hacks for SIGTSTP and SIGCONT with IRB. Of course, with the input history recording in IRB it's typically simple enough to just restart the process.
Looking at how other applications have approached job control at the console, I think Emacs wraps its TTY I/O streams in some kind of encapsulated struct, judging mainly from Emacs' terminal.c.
Glad to see if there's job control available in Ruby though, and it does not even need a custom signal handler for some applications?

linux kernel check if process is still running

I'm working in kernel space and I want to find out when an application has stopped or crashed.
When I receive an ioctl call, I can get the struct task_struct where I have a lot of information regarding the process of the application.
My problem is that I want to periodically check if the process is still alive or better yet, to have some asynchronous call when the process is killed.
My test environment was on QEMU and after a while in the application I've run a system("kill -9 pid"). Meanwhile in the kernel I've had a periodical check on task_struct with:
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
static inline int pid_alive(struct task_struct *p)
The problem is that the task_struct behind my pointer seems to be unmodified. Normally I would say that each process has a task_struct and that it of course corresponds to the process state. Otherwise I don't see the point of "volatile long state".
What am I missing? Is it that I'm testing on QEMU, or is it that I'm checking the task_struct in a while(1) with an msleep of 100? Any help would be appreciated.
I would be partially happy if I could receive the pid of the application when the app is closing the file descriptor of the module ("/dev/driver").
Thanks!
You cannot hive off the task_struct pointer and refer to it later. If the process has been killed, the pointer is no longer valid - that task_struct is gone. You also should not be using PID values within the kernel to refer to processes. PID values are re-used, so you might not even be talking about the same process.
Your driver can supply a .release callback, which will be called when your driver file is closed, including if the process is terminated or killed. You can access current from this callback. Note that if a process opens your file and then forks, the process calling .release could well be different from the process that called .open. Your driver must be able to handle this.
It has been a long time since I mucked around inside the kernel. It seems to me if your process actually dies, then your best bet would be to put hooks into the code that tears down processes. If it doesn't die but gets caught in a non-responsive loop, you'd probably be better off causing an application level core dump.
A solution that worked beautifully in my operating systems homework is to use a kprobe to detect when do_exit is called. What's beautiful is that do_exit will always be called, no matter how the process is closed. I think even in the case of a kernel oops this one will still be called.
You should also hook into _do_fork, just in case.
Oh, and look at the .release callback mentioned in the other answer (do note that dup2 and fork will cause unexpected behavior -- you will only be notified when the last of the copies created by these two is closed).
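The "last copy" behaviour is not specific to drivers and can be observed from user space. As a rough analogy (my own illustration, not driver code), the reader of a pipe only sees end-of-file once every duplicate of the write descriptor has been closed, because the duplicates share one open file. The function name eof_only_after_last_close is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if end-of-file shows up only after the LAST copy of the write
 * end is closed, 0 if not, and -1 if the setup itself fails. */
int eof_only_after_last_close(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;
    int rd = fds[0], wr = fds[1];

    /* Second descriptor referring to the same open file as `wr`. */
    int wr_copy = dup(wr);
    if (wr_copy == -1) {
        close(rd);
        close(wr);
        return -1;
    }

    /* Make reads non-blocking so "no EOF yet" shows up as EAGAIN. */
    int flags = fcntl(rd, F_GETFL);
    if (flags == -1 || fcntl(rd, F_SETFL, flags | O_NONBLOCK) == -1) {
        close(rd);
        close(wr);
        close(wr_copy);
        return -1;
    }

    char c;
    close(wr);                          /* first copy closed */
    ssize_t before = read(rd, &c, 1);   /* write side still open via wr_copy */
    int before_errno = errno;

    close(wr_copy);                     /* last copy closed */
    ssize_t after = read(rd, &c, 1);    /* now the reader sees EOF */

    close(rd);
    return before == -1 && before_errno == EAGAIN && after == 0;
}
```

A descriptor inherited across fork behaves like the dup'ed copy here, which is why a driver cannot assume that the process closing the file last is the one that opened it.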

How to debug malware injected code?

I am debugging a piece of malware that injects code into Notepad.exe using the following approach:
CreateProcess(notepad.exe, CREATE_SUSPENDED)
GetThreadContext
VirtualProtectEx
WriteProcessMemory(address=1000000, Size:10200)
WriteProcessMemory(address=7FFD8008, Size:4)
SetThreadContext
ResumeThread
There is no PID I can use to attach a debugger to Notepad.exe before it resumes.
After it resumes, the thread runs so fast that I can't attach OllyDbg in time.
I dumped the memory that gets written to Notepad.exe and saved it as a PE file,
but it fails with an error when run.
So how can I debug the injected code? Thanks!
You should change the first byte of the injected code to 'int 3' (opcode 0xCC) before WriteProcessMemory is invoked.
OllyDbg (OD) can't attach to a process whose main thread hasn't started; use WinDbg instead.
Invoke ResumeThread after WinDbg has attached to the child process.
Press F5 to let the main thread run.
The main thread will stop when it hits the 'int 3'. Now change the byte back to its original value, for example: eb addr_to_change 55. PS: opcode 55 is 'push ebp', which is the most common instruction at the beginning of a function.
Now press F10 to start single-step debugging.
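The byte swap in the first step can be sketched like this. It is only an illustration under assumptions: the payload is available in a local buffer before it is written to the target, and plant_breakpoint, INT3_OPCODE and the entry_offset parameter are hypothetical names introduced here (entry_offset is where execution starts inside the payload, which need not be offset 0).

```c
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC  /* x86 'int 3' breakpoint instruction */

/* Overwrite the byte at `entry_offset` in `payload` with int 3 and hand
 * back the original byte through `saved`, so it can be restored from the
 * debugger once the breakpoint has been hit.
 * Returns 0 on success, -1 on bad arguments. */
int plant_breakpoint(uint8_t *payload, size_t len,
                     size_t entry_offset, uint8_t *saved)
{
    if (payload == NULL || saved == NULL || entry_offset >= len)
        return -1;

    *saved = payload[entry_offset];      /* e.g. 0x55 for 'push ebp' */
    payload[entry_offset] = INT3_OPCODE;
    return 0;
}
```

The saved byte is the value you later type into the debugger's eb command to put the original instruction back.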
After CreateProcess returns, the process should already exist and you should be able to attach to it. Another approach is to skip the ResumeThread call and attach at that point.

In the linux kernel, where is the first process initialized?

I'm looking for the code in the linux kernel (2.4.x) that initializes the first process, pid=0.
Many searches provided many clues, but I still cannot find it.
Any pointers, anyone?
The initial task struct is set up by the macro INIT_TASK(), defined in include/linux/init_task.h. All other task structs are created by do_fork.
start_kernel()
check out rest_init() at the end
// idle process, pid = 0
cpu_idle(); // never return
The first process that the kernel initializes is the swapper process or the idle thread. This thread runs forever. When no other process is active in the system, then this thread [which is cpu_idle() function found in arch/arm/kernel/process.c for the ARM architecture] calls the architecture dependent pm_idle function, which power collapses the CPU until a timer interrupt or some other interrupt wakes it up.
The swapper process [pid=0] is initialized in arch/arm/kernel/init_task.c by the macro INIT_TASK.

Is it necessary to explicitly stop all threads prior to exiting a Win32 application?

I have a Win32 native VC++ application that upon entering WinMain() starts a separate thread, then does some useful job while that other thread is running, then simply exits WinMain() - the other thread is not explicitly stopped.
This blog post says that a .NET application will not terminate in this case since the other thread is still running. Does the same apply to native Win32 applications?
Do I have to stop all threads prior to exiting?
Yes, you have to if you are simply exiting or terminating the main thread via ExitThread or TerminateThread, otherwise your application may not fully shut down. I recommend reading Raymond Chen's excellent blog posts on this topic:
The old-fashioned theory on how processes exit
Quick overview of how processes exit on Windows XP
How my lack of understanding of how processes exit on Windows XP forced a security patch to be recalled
During process termination, the gates are now electrified
If you return from the main thread, does the process exit?
But please note in particular that if you properly return from the main or WinMain function, the process will exit as described by the ExitProcess API documentation and by the last of Raymond Chen's posts listed above!
The short of it is:
For a native Win32 process to terminate, one of two conditions must be met:
Someone calls ExitProcess or TerminateProcess.
All the threads exit (by returning from their ThreadProc, including the WinMainEntryPoint that is the first thread created by Windows), end themselves (by calling ExitThread), or are terminated (someone calls TerminateThread).
(The first condition is actually the same as the 2nd: ExitProcess and TerminateProcess, as part of their cleanup, both call TerminateThread on each thread in the process).
The C runtime imposes different conditions: for a C/C++ application to terminate, you must either:
return from main (or WinMain).
call exit()
Calling exit() or returning from main() both cause the C runtime to call ExitProcess(), which is how C and C++ applications exit without cleaning up their threads. I, personally, think this is a bad thing.
However, non-trivial Win32 processes can never terminate by the second route, because many otherwise perfectly reasonable Win32 subsystems (Winsock, OLE, etc.) create worker threads and do not provide any way to make those threads exit.
No, when WinMain returns, the process will be terminated, and this means all threads spawned by the process will be terminated, though they might not be shut down gracefully.
However, it is possible for the primary thread to terminate while other threads are running, leaving the application still running. If you call ExitThread (not exit or ExitProcess) in WinMain, and there are running threads (for example, ones created by the primary thread), then you may observe this behavior. Nonetheless, simply returning from WinMain will call ExitProcess, and that means all threads will be terminated.
Correct me if it's wrong.
I think you can first close all your windows (so the user won't see your application) and then set an exit flag. Your thread should check the flag periodically and return once it finds the flag set.
After setting the flag, your main thread could call ::WaitForSingleObject() or ::WaitForMultipleObjects() with a timeout (say, three seconds); if the thread(s) have not returned by then, just kill them with ::TerminateThread().
short answer : yes

Resources