Ruby - fork, exec, detach ... do we have a race condition here?

Simple example, which doesn't work on my platform (Ruby 2.2, Cygwin):
#!/usr/bin/ruby
backtt = fork { exec('mintty','/usr/bin/zsh','-i') }
Process.detach(backtt)
exit
This tiny program (when started from the shell) is supposed to spawn a terminal window (mintty) and then get me back to the shell prompt.
However, while it DOES create the mintty window, I don't have a shell prompt afterwards, and I can't type anything in the calling shell.
But when I introduce a small delay before the detach, either using 'sleep', or by printing something on stdout, it works as expected:
#!/usr/bin/ruby
backtt = fork { exec('mintty','/usr/bin/zsh','-i') }
sleep 1
Process.detach(backtt)
exit
Why is this necessary?
BTW, I'm well aware that I could (from the shell) do a
mintty /usr/bin/zsh -i &
directly, or I could use system(...... &) from inside Ruby, but this is not the point here. I'm particularly interested in the fork/exec/detach behaviour in Ruby. Any insights?

Posting as an answer, because it is too long for a comment
Although I am no specialist in Ruby, and do not know Cygwin at all, this situation sounds very familiar to me, coming from C/C++.
This script is so short that the parent completes while the child is still trying to start.
What would happen if you put the sleep after detach and before exit?
If my theory is correct, it should work too. Your program exits before any (or enough) thread-switching happens.
I call such problems "interrupted hand shaking". Although this is psychology terminology, it describes what happens.
Sleep "gives up the time slice", leading to thread switching.
Console output (any file I/O) runs into semaphores, also leading to thread switching.
If my idea is correct, it should also work if you don't "sleep" but just count to 1e9 (depending on the speed of computation), because preemptive multitasking then forces a thread switch even though the thread never gives up the CPU itself.
So it is an error in programming (IMHO: race condition is philosophical in that case), but it will get hard to find "who" is responsible. There are many things involved.

According to the documentation:
Process::detach prevents this by setting up a separate Ruby thread whose sole job is to reap the status of the process pid when it terminates.
NB: I can't reproduce this behaviour on any of the operating systems available to me, and I'm posting this as an answer just for the sake of formatting.
Since Process.detach(backtt) transparently creates a thread, I suggest you try:
#!/usr/bin/ruby
backtt = fork { exec('mintty','/usr/bin/zsh','-i') }
# ⇓⇓⇓⇓⇓
Process.detach(backtt).join
exit
This is not a hack by any means (as opposed to the silly sleep), since you are presumably aware that the underlying command should return more or less immediately. I am no Cygwin guru, but it might have some specific issues with threads, so let this thread be handled.
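To make the thread-based behaviour concrete, here is a rough analogue in Python (chosen only for illustration). The `spawn` and `detach` helpers are hypothetical names, not Ruby or Python APIs; the sketch assumes a POSIX-like system with `os.fork`. It reaps the child from a background thread, which is what the documentation quoted above describes, and joining that thread corresponds to the `.join` suggestion.

```python
import os
import threading


def spawn(argv):
    """Fork and exec argv in the child; return the child's PID."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError:
            pass
        os._exit(127)  # only reached if the exec failed
    return pid


def detach(pid):
    """Hypothetical analogue of Process.detach: reap pid in a thread."""
    result = {}

    def reap():
        _, status = os.waitpid(pid, 0)  # blocks until the child terminates
        result["status"] = status

    thread = threading.Thread(target=reap, daemon=True)
    thread.start()
    return thread, result
```

Calling `thread.join()` on the returned thread blocks until the child has exited, so the parent cannot finish before the child has at least been started and reaped.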

I'm neither a Ruby nor a Cygwin guy, so what I propose here may not work at all. Anyways: I guess, you're not even hitting a Ruby or Cygwin specific bug here. In a program called "start" I've written in C many years ago, I hit the same issue. Here is a comment from the start of the function void daemonize_now():
/*
* This is a little bit trickier than I expected: If we simply call
* setsid(), it may fail! We have to fork() and exit(), and let our
* child call setsid().
*
* Now the problem: If we fork() and exit() immediately, our child
* will be killed before it ever had been run. So we need to sleep a
* little bit. Now the question: How long? I don't know an answer. So
* let us being killed by our child :-)
*/
So, the strategy is this: Let the parent wait on its child (that can be done immediately, before the child has actually had a chance to do anything) and then let the child do the detaching part. How? Let it create a new session (once the parent is gone, it will be reparented to the init process). That's what the setsid() call mentioned in the comment is for. It will work something like this (C syntax; you should be able to look up the correct usage for Ruby and apply the needed changes yourself):
parentspid = getpid();
Fork = fork();
if (Fork) {
    if (Fork == -1) { // fork() failed
        /* handle error */
    } else { // parent, Fork is the pid of the child
        int tmp;
        waitpid(0, &tmp, 0); // wait here until the child signals us (or exits)
    }
} else { // child
    if (setsid() == -1) {
        /* handle error - possibly by doing nothing
           and just let the parent wait ... */
    } else {
        kill(parentspid, SIGUSR1);
    }
    exec(...);
}
You can use any signal that terminates the process (e.g. SIGKILL). I used SIGUSR1 and installed a signal handler that exit(0)s the parent process, so the caller gets a success message. Only caveat: You get a success even if the exec fails. However, that is a problem that can't really be worked around with signals, since after a successful exec you can't signal your parent anything anymore. And since you don't know when the exec will have failed (if it fails), you're back at the race condition part.
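The caveat about a failed exec can be narrowed with a close-on-exec pipe, which is a different technique from the signal handshake described above and is not what the original program did. The sketch below is in Python purely for illustration, and `spawn_session_leader` is a hypothetical name. It assumes a POSIX system: the write end of the pipe is closed automatically by a successful exec, so the parent sees end-of-file; if setsid() or the exec fails, the child writes the error number instead.

```python
import os


def spawn_session_leader(argv):
    """Fork, call setsid() in the child, then exec argv.

    Returns (pid, 0) once the exec has succeeded, or (pid, errno) if
    setsid() or the exec failed in the child.
    """
    # os.pipe() returns non-inheritable (close-on-exec) descriptors.
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        os.close(read_fd)
        try:
            os.setsid()
            os.execvp(argv[0], argv)
        except OSError as exc:
            os.write(write_fd, str(exc.errno).encode())
        os._exit(127)
    # parent
    os.close(write_fd)
    data = os.read(read_fd, 32)  # b"" means the exec closed the pipe
    os.close(read_fd)
    if data:
        os.waitpid(pid, 0)  # the child failed; reap it
        return pid, int(data)
    return pid, 0
```

The parent needs neither a sleep nor a signal handler here: the read returns as soon as the child has either exec'ed or reported an error.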

Related

Is there a race between starting and seeing yourself in WinApi's EnumProcesses()?

I just found this code in the wild:
def _scan_for_self(self):
    win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
    basename = self.cmdline.split()[0]
    pids = win32process.EnumProcesses()
    if not pids:
        UserLog.warn("WindowsProcess", "no pids", pids)
    for pid in pids:
        try:
            handle = win32api.OpenProcess(
                win32con.PROCESS_QUERY_INFORMATION | win32con.PROCESS_VM_READ,
                pywintypes.FALSE, pid)
        except pywintypes.error, err:
            UserLog.warn("WindowsProcess", str(err))
            continue
        try:
            modlist = win32process.EnumProcessModules(handle)
        except pywintypes.error,err:
            UserLog.warn("WindowsProcess",str(err))
            continue
This line caught my eye:
win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
It suggests that if you call EnumProcesses() too fast after starting, you won't see yourself. Is there any truth to this?
There is a race, but it's not the race the code tried to protect against.
A successful call to CreateProcess returns only after the kernel object representing the process has been created and enqueued into the kernel's process list. A subsequent call to EnumProcesses accesses the same list, and will immediately observe the newly created process object.
That is, unless the process object has since been destroyed. This isn't entirely unusual since processes in Windows are initialized in-process. The documentation even makes note of that:
Note that the function returns before the process has finished initialization. If a required DLL cannot be located or fails to initialize, the process is terminated.
What this means is that if a call to EnumProcesses immediately following a successful call to CreateProcess doesn't observe the newly created process, it does so because it was late rather than early. If you are late already then adding a delay will only make you more late.
Which swiftly leads to the actual race here: Process IDs uniquely identify processes only for a finite time interval. Once a process object is gone, its ID is up for grabs, and the system will reuse it at some point. The only reliable way to identify a process is by holding a handle to it.
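As a small illustration of identifying a process by a held object rather than by a PID looked up later, here is a hedged Python sketch. The helper names are hypothetical. It relies on the assumption that `subprocess.Popen` keeps the process handle open on Windows (an implementation detail of CPython's subprocess module), and on the fact that on POSIX a child's PID is not reused until the parent has reaped it.

```python
import subprocess


def start_and_track(argv):
    """Start a child and keep the Popen object as the process's identity."""
    return subprocess.Popen(argv)


def has_exited(proc):
    """Ask the held process object, never a PID found in a system-wide list."""
    return proc.poll() is not None
```

With this shape there is no need to scan the process list at all, so neither the delay nor the PID-reuse window comes into play.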
Now it's anyone's guess what the author of _scan_for_self was trying to accomplish. As written, the code takes more time to do something that's probably altogether wrong [1] anyway.
[1] Turns out my gut feeling was correct. This is just your average POSIX developer who, in the process of learning that POSIX is insufficient, would rather call out Microsoft than actually use an all-around superior API.
The documentation for EnumProcesses (Win32 API - EnumProcesses function) does not mention anything about a delay needed to see the current process in the list it returns.
The example from Microsoft on how to use EnumProcesses to enumerate all running processes (Enumerating All Processes) also does not contain any delay before calling EnumProcesses.
A small test application I created in C++ (see below) always reports that the current process is in the list (tested on Windows 10):
#include <Windows.h>
#include <Psapi.h>
#include <iostream>
#include <vector>

const DWORD MAX_NUM_PROCESSES = 4096;
DWORD aProcesses[MAX_NUM_PROCESSES];

int main(void)
{
    // Get the list of running process Ids:
    DWORD cbNeeded;
    if (!EnumProcesses(aProcesses, MAX_NUM_PROCESSES * sizeof(DWORD), &cbNeeded))
    {
        return 1;
    }
    // Check if current process is in the list:
    DWORD curProcId = GetCurrentProcessId();
    bool bFoundCurProcId{ false };
    DWORD numProcesses = cbNeeded / sizeof(DWORD);
    for (DWORD i=0; i<numProcesses; ++i)
    {
        if (aProcesses[i] == curProcId)
        {
            bFoundCurProcId = true;
        }
    }
    std::cout << "bFoundCurProcId: " << bFoundCurProcId << std::endl;
    return 0;
}
Note: I am aware that the fact that the program reported the expected result does not mean that there is no race. Maybe I just couldn't catch it manifesting. But running code like this can sometimes give you a hint (especially if the result had been that there is a race).
The fact that I never had a problem running this test (I ran it many times), together with the lack of any mention of a required delay in Microsoft's documentation, makes me believe that it is not required.
My conclusion is that either:
There is a unique issue when using it from Python (doubt it).
or:
The code you found is doing something unnecessary.
There is no race.
EnumProcesses calls a NT API function that switches to kernel mode to walk the linked list of processes. Your own process has been added to the list before it starts running.

Unix: fork and wait

Here is the last part I do not understand in the source code of the command if.
Source: http://v6shell.org/history/if.c, with Syntax-Highlighting: http://pastebin.com/bj0Hvfrw
if(eq(a, "{")) { /* execute a command for exit code */
    if(fork()) /*parent*/ wait(&ccode);
    else { /*child*/
        doex(1);
        goto err;
    }
    while((a=nxtarg()) && (!eq(a,"}")));
    return(ccode? 0 : 1);
}
As described in the man page (http://man.cat-v.org/unix-6th/1/if), if we put the command in braces, "if expr { command }", we can obtain its exit code.
So we fork the current process and then wait for our child process to finish? But where does our child process continue its work? After the fork, we go into the while loop, just skip some arguments, and then return with ccode? Where was ccode changed? What is ccode?
Could you please explain the given code snippet?
And elaborate on ccode?
The man page of wait: http://man.cat-v.org/unix-6th/2/wait
fork splits the current process in two: it makes a new process, running the same code, which begins running from the same point as the fork call. fork returns a different value in the parent and the child: in the parent, it returns the PID of the child process, and in the child it returns zero. The PID is a true value, so the wait call only executes in the parent (as the comment says), and the "else" branch only executes in the child (as its comment says). Both processes execute in parallel from the point of the fork onwards.
doex performs an exec of another program, replacing the child process, which then terminates with the new program's exit code. Only the doex call and execv execute in the child process from the current program; the goto err after it is reached only if doex returns, presumably because the exec failed.
wait:
causes its caller to delay until one of its child processes terminates.
That is, it causes the parent to pause until the child has exited. It is passed a pointer to an int variable and writes exit information for the child process into that variable. ccode is defined elsewhere in the enclosing function. The child process's exit code will be the exit code of the command that was execed.
When ccode has been given a non-zero value, that indicates an error running the program. In that case, the function returns zero, and otherwise it returns 1 to indicate success to its caller.
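The same parent/child split can be sketched in a few lines of Python (used here only for illustration; `command_succeeds` is a hypothetical name, and the sketch assumes a POSIX system). The child replaces itself with the command, the parent blocks in a wait call, and the status written by the wait plays the role of ccode.

```python
import os


def command_succeeds(argv):
    """Run argv in a child and report success like the { } branch of if.c."""
    pid = os.fork()
    if pid == 0:  # child: replace ourselves with the command
        try:
            os.execvp(argv[0], argv)
        except OSError:
            pass
        os._exit(127)  # exec failed (if.c jumps to err at this point)
    _, ccode = os.waitpid(pid, 0)  # parent: block until the child exits
    return ccode == 0  # same test as `return(ccode ? 0 : 1)`
```

For example, `command_succeeds(["sh", "-c", "exit 3"])` is false because the wait status is non-zero.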
I encourage you to take a look at either POSIX/the Single Unix Specification, the ISO C standard, or a standard C programming textbook to help understand what's going on in this codebase. The man pages that you link to also describe what the functions do, but often the newer versions fill in the gaps or are generally clearer, and the behaviour hasn't changed too much.
Although all of these questions are related to historical Unix, and the interplay between Unix and C at that point combined with the subsequent changes to both in the intervening time makes them arguably on-topic, they're also rudimentary programming questions (and so arguably off-topic).

fork()/exec() in XWindow application

How to execute xterm from XWindow program, insert it into my window, but continue execution both while xterm is active and after it was closed?
In my XWindows (XLib over XCB) application I want to execute xterm -into <handle>, so that my window contains the xterm window. Unfortunately, something goes wrong.
pseudo code:
if (fork() == 0) {
    pipe = popen('xterm -into ' + handle);
    while (!feof(pipe)) gets(pipe);
    exit(0);
}
I tried system() and execvp() as well. Everything is fine until I exit from the bash that runs in xterm; then my program exits. I guess that the connection to the X server is lost because it is shared between parent and child.
UPDATE: here is what is shown on terminal after program exits (or rather crashes).
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0.0"
after 59 requests (59 known processed) with 1 events remaining.
[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
y: ../../src/xcb_io.c:274: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
Aborted
One possibility is that your program is terminating because the SIGCHLD signal is not being ignored and is causing it to abort.
signal(SIGCHLD, SIG_IGN);
Another is, as you suspect, that something is actively closing the X session. Just closing the socket itself should not matter, but if you are using a library that registers an atexit handler, it could cause an issue.
Since, from your snippet, it looks like you don't actually care about the stdout of the xterm, a better way to do it would be to actually close fds 0, 1, and 2. Also, since it looks like you don't need to do anything in the child process after xterm terminates, you can use 'exec' rather than 'popen' to fully replace the child process with that of the xterm, including any cleanup handlers that were left around. Though I am not sure how pruned your snippet is from what you want to do, as obviously the call to 'gets' is not what you want.
To make sure the X connection is closed, you can set its close-on-exec flag with the following (this will work on POSIX systems, where the X connection number is the fd of the server socket):
fcntl(XConnectionNumber(display), F_SETFD, fcntl(XConnectionNumber(display), F_GETFD) | FD_CLOEXEC);
Also note that 'popen' itself forks in the background in addition to your fork. I think you probably want to do an execvp there, then use waitpid(... , WNOHANG) to check for the child's termination in your main X11 loop if you care to know when it exited.
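The pieces of this advice (close-on-exec flag, fork plus exec, non-blocking waitpid) can be sketched as follows. Python is used only for illustration, all three helper names are hypothetical, and the descriptor passed to `set_cloexec` stands in for the X connection's fd.

```python
import fcntl
import os


def set_cloexec(fd):
    """Mark fd close-on-exec so an exec'ed child does not inherit it."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def spawn(argv):
    """Fork and exec argv directly, without popen's extra shell process."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError:
            pass
        os._exit(127)  # only reached if the exec failed
    return pid


def poll_child(pid):
    """Non-blocking check: None while running, else the wait status."""
    done, status = os.waitpid(pid, os.WNOHANG)
    if done == 0:
        return None
    return status
```

In an event loop, `poll_child` would be called once per iteration, so the GUI keeps running both while the child is alive and after it has exited.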

linux kernel check if process is still running

I'm working in kernel space and I want to find out when an application has stopped or crashed.
When I receive an ioctl call, I can get the struct task_struct where I have a lot of information regarding the process of the application.
My problem is that I want to periodically check if the process is still alive or better yet, to have some asynchronous call when the process is killed.
My test environment was on QEMU and after a while in the application I've run a system("kill -9 pid"). Meanwhile in the kernel I've had a periodical check on task_struct with:
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
static inline int pid_alive(struct task_struct *p)
The problem is that the task_struct my pointer refers to seems to be unmodified. Normally I would say that each process has a task_struct and that it of course corresponds to the process state. Otherwise I don't see the point of "volatile long state".
What am I missing? Is it that I'm testing on QEMU, or is it that I'm checking the task_struct in a while(1) loop with an msleep of 100? Any help would be appreciated.
I would be partially happy if I could receive the pid of the application when the app is closing the file descriptor of the module ("/dev/driver").
Thanks!
You cannot hive off the task_struct pointer and refer to it later. If the process has been killed, the pointer is no longer valid - that task_struct is gone. You also should not be using PID values within the kernel to refer to processes. PID values are re-used, so you might not even be talking about the same process.
Your driver can supply a .release callback, which will be called when your driver file is closed, including if the process is terminated or killed. You can access current from this callback. Note that if a process opens your file and then forks, the process calling .release could well be different from the process that called .open. Your driver must be able to handle this.
It has been a long time since I mucked around inside the kernel. It seems to me if your process actually dies, then your best bet would be to put hooks into the code that tears down processes. If it doesn't die but gets caught in a non-responsive loop, you'd probably be better off causing an application level core dump.
A solution that worked beautifully in my operating systems homework is to use a kprobe to detect when do_exit is called. What's beautiful is that do_exit will always be called, no matter how the process is closed. I think even in the case of a kernel oops this one will still be called.
You should also hook into _do_fork, just in case.
Oh, and look at the .release callback mentioned in the other answer (do note that dup2 and fork will cause unexpected behavior -- you will only be notified when the last of the copies created by these two is closed).

How can a C/C++ program put itself into background?

What's the best way for a running C or C++ program that's been launched from the command line to put itself into the background, equivalent to if the user had launched from the unix shell with '&' at the end of the command? (But the user didn't.) It's a GUI app and doesn't need any shell I/O, so there's no reason to tie up the shell after launch. But I want a shell command launch to be auto-backgrounded without the '&' (or on Windows).
Ideally, I want a solution that would work on any of Linux, OS X, and Windows. (Or separate solutions that I can select with #ifdef.) It's ok to assume that this should be done right at the beginning of execution, as opposed to somewhere in the middle.
One solution is to have the main program be a script that launches the real binary, carefully putting it into the background. But it seems unsatisfying to need these coupled shell/binary pairs.
Another solution is to immediately launch another executed version (with 'system' or CreateProcess), with the same command line arguments, but putting the child in the background and then having the parent exit. But this seems clunky compared to the process putting itself into background.
Edited after a few answers: Yes, a fork() (or system(), or CreateProcess on Windows) is one way to sort of do this, that I hinted at in my original question. But all of these solutions make a SECOND process that is backgrounded, and then terminate the original process. I was wondering if there was a way to put the EXISTING process into the background. One difference is that if the app was launched from a script that recorded its process id (perhaps for later killing or other purpose), the newly forked or created process will have a different id and so will not be controllable by any launching script, if you see what I'm getting at.
Edit #2:
fork() isn't a good solution for OS X, where the man page for 'fork' says that it's unsafe if certain frameworks or libraries are being used. I tried it, and my app complains loudly at runtime: "The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec()."
I was intrigued by daemon(), but when I tried it on OS X, it gave the same error message, so I assume that it's just a fancy wrapper for fork() and has the same restrictions.
Excuse the OS X centrism, it just happens to be the system in front of me at the moment. But I am indeed looking for a solution to all three platforms.
My advice: don't do this, at least not under Linux/UNIX.
GUI programs under Linux/UNIX traditionally do not auto-background themselves. While this may occasionally be annoying to newbies, it has a number of advantages:
Makes it easy to capture standard error in case of core dumps / other problems that need debugging.
Makes it easy for a shell script to run the program and wait until it's completed.
Makes it easy for a shell script to run the program in the background and get its process id:
gui-program &
pid=$!
# do something with $pid later, such as check if the program is still running
If your program forks itself, this behavior will break.
"Scriptability" is useful in so many unexpected circumstances, even with GUI programs, that I would hesitate to explicitly break these behaviors.
Windows is another story. AFAIK, Windows programs automatically run in the background--even when invoked from a command shell--unless they explicitly request access to the command window.
On Linux, daemon() is what you're looking for, if I understand you correctly.
The way it's typically done on Unix-like OSes is to fork() at the beginning and exit from the parent. This won't work on Windows, but is much more elegant than launching another process where forking exists.
Three things need doing,
fork
setsid
redirect STDIN, STDOUT and STDERR to /dev/null
This applies to POSIX systems (all the ones you mention claim to be POSIX (but Windows stops at the claiming bit))
On UNIX, you need to fork twice in a row and let the parent die.
A process cannot put itself into the background, because it isn't the one in charge of background vs. foreground. That would be the shell, which is waiting for process exit. If you launch a process with an ampersand "&" at the end, then the shell does not wait for process exit.
But the only way the process can escape the shell is to fork off another child and then let its original self exit back to the waiting shell.
From the shell, you can background a process with Control-Z, then type "bg".
Backgrounding a process is a shell function, not an OS function.
If you want an app to start in the background, the typical trick is to write a shell script that launches it in the background.
#! /bin/sh
/path/to/myGuiApplication &
To followup on your edited question:
I was wondering if there was a way to put the EXISTING process into the background.
In a Unix-like OS, there really is not a way to do this that I know of. The shell is blocked because it is executing one of the variants of a wait() call, waiting for the child process to exit. There is not a way for the child process to remain running but somehow cause the shell's wait() to return with a "please stop watching me" status. The reason you have the child fork and exit the original is so the shell will return from wait().
Here is some pseudocode for Linux/UNIX:
initialization_code()
if(failure) exit(1)
if( fork() > 0 ) exit(0)
setsid()
setup_signal_handlers()
for(fd=0; fd<NOFILE; fd++) close(fd)
open("/dev/null", O_RDONLY)
open("/dev/null", O_WRONLY)
open("/dev/null", O_WRONLY)
chdir("/")
And congratulations, your program continues as an independent "daemonized" process without a controlling TTY and without any standard input or output.
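A compact version of these steps, written in Python only for illustration, might look like the sketch below. `daemonize` is a hypothetical helper, and it deliberately differs from the pseudocode in one respect: it redirects descriptors 0, 1, and 2 to /dev/null instead of closing every descriptor, and it installs no signal handlers.

```python
import os


def daemonize():
    """Detach from the launching shell; returns only in the background child."""
    if os.fork() > 0:
        os._exit(0)  # original process exits, so the shell's wait() returns
    os.setsid()  # new session, no controlling TTY
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):  # point stdin, stdout, stderr at /dev/null
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)
    os.chdir("/")
```

Because the original process exits right after the fork, the code that continues after `daemonize()` runs under a new PID, which is exactly the trade-off discussed in the question.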
Now, in Windows you simply build your program as a Win32 application with WinMain() instead of main(), and it runs without a console automatically. If you want to run as a service, you'll have to look that up because I've never written one and I don't really know how they work.
You edited your question, but you may still be missing the point that your question is a syntax error of sorts -- if the process wasn't put in the background to begin with and you want the PID to stay the same, you can't ignore the fact that the program which started the process is waiting on that PID and that is pretty much the definition of being in the foreground.
I think you need to think about why you want to both put something in the background and keep the PID the same. I suggest you probably don't need both of those constraints.
As others mentioned, fork() is how to do it on *nix. You can get fork() on Windows by using MingW or Cygwin libraries. But those will require you to switch to using GCC as your compiler.
In pure Windows world, you'd use CreateProcess (or one of its derivatives CreateProcessAsUser, CreateProcessWithLogonW).
The simplest form of backgrounding is:
if (fork() != 0) exit(0);
In Unix, if you want to background and dissociate from the tty completely, you would do:
Close all descriptors which may access a tty (usually 0, 1, and 2).
if (fork() != 0) exit(0);
setpgid(0,getpid()); /* Might be necessary to prevent a SIGHUP on shell exit. */
signal(SIGHUP,SIG_IGN); /* just in case, same as using nohup to launch program. */
fd=open("/dev/tty",O_RDWR);
ioctl(fd,TIOCNOTTY,0); /* Disassociates from the terminal */
close(fd);
if (fork() != 0) exit(0); /* just for good measure */
That should fully daemonize your program.
The most common way of doing this under Linux is via forking. The same should work on Mac, as for Windows I'm not 100% sure but I believe they have something similar.
Basically what happens is the process splits itself into two processes, and then the original one exits (returning control to the shell or whatever), and the second process continues to run in the background.
I'm not sure about Windows, but on UNIX-like systems, you can fork() then setsid() the forked process to move it into a new process group that is not connected to a terminal.
Under Windows, the closest thing you're going to get to fork() is loading your program as a Windows service, I think.
Here is a link to an intro article on Windows services...
CodeProject: Simple Windows Service Sample
So, as you say, just fork()ing will not do the trick. What you must do is fork() and then re-exec(), as this code sample does:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <CoreFoundation/CoreFoundation.h>
int main(int argc, char **argv)
{
    int i, j;
    for (i=1; i<argc; i++)
        if (strcmp(argv[i], "--daemon") == 0)
        {
            for (j = i+1; j<argc; j++)
                argv[j-1] = argv[j];
            argv[argc - 1] = NULL;
            if (fork()) return 0;
            execv(argv[0], argv);
            return 0;
        }
    sleep(1);
    CFRunLoopRun();
    CFStringRef hello = CFSTR("Hello, world!");
    printf("str: %s\n", CFStringGetCStringPtr(hello, CFStringGetFastestEncoding(hello)));
    return 0;
}
The loop is to check for a --daemon argument, and if it is present, remove it before re-execing so an infinite loop is avoided.
I don't think this will work if the binary is put into the path because argv[0] is not necessarily a full path, so it will need to be modified.
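The flag-stripping and re-exec logic can be sketched as below, in Python purely for illustration and with hypothetical helper names. Two assumptions differ from the C sample: every occurrence of the flag is removed rather than just the first, and `execvp` is used, which searches PATH when the program name contains no slash and so addresses the argv[0] concern. `argv` is taken to be the full command line of a native executable.

```python
import os


def strip_flag(argv, flag="--daemon"):
    """Return (new_argv, found) with every occurrence of flag removed."""
    rest = [arg for arg in argv if arg != flag]
    return rest, len(rest) != len(argv)


def relaunch_in_background(argv):
    """If --daemon is present, fork and re-exec without it; else return."""
    rest, found = strip_flag(argv)
    if not found:
        return
    if os.fork() > 0:
        os._exit(0)  # the launching shell gets its prompt back
    os.execvp(rest[0], rest)  # searches PATH when rest[0] has no slash
```

Removing the flag before the exec is what prevents the relaunched program from forking again forever.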
/** Daemonize */
pid_t pid;
pid = fork(); /** father makes a little daemon (son) */
if(pid>0)
    exit(0); /** father dies */
while(1){
    printf("Hello I'm your little daemon %d\n", (int)getpid()); /** the child daemon goes on */
    sleep(1);
}
/** try 'nohup' in linux (usage: nohup <command> &) */
In Unix, I have learned to do that using fork().
If you want to put a running process into the background, fork it twice.
I was trying the solution.
Only one fork is needed from the parent process.
The most important point is that, after fork, the parent process must die by calling _exit(0); and NOT by calling exit(0);
When _exit(0); is used, the command prompt immediately returns on the shell.
This is the trick.
If you need a script to have the PID of the program, you can still get it after a fork.
When you fork, save the PID of the child in the parent process. When you exit the parent process, output the PID to STD{OUT,ERR}. (A return pid; statement at the end of main() may look simpler, but an exit status only carries the low 8 bits, so most PIDs would be truncated.) A calling script can then get the PID of the program, although it requires a certain knowledge of how the program works.

Resources