fork() and wait() connection to pid

I know that fork() creates a child process, returns 0 to child and returns child's pid to parent.
From what I understand wait() also returns some kind of pid of the child process that's terminated. Is this the same pid as the one that's returned to parent after fork?
I don't understand how to use wait().
My textbook just shows
int ReturnCode;
while (pid!=wait(&ReturnCode));
/* the child has terminated with ReturnCode as its status */
I don't even understand what this means.
How do I use wait()? I am using execv to create a child process but I want parent to wait. Someone please explain and give an example.
Thanks

wait() does indeed return the PID of the child process that died, and it is the same PID that fork() returned to the parent. If you only have one child process, you don't really need to compare the PID (do check that it's not -1 though; the wait call fails when, for example, there are no children left to wait for or a signal interrupts it). You can find an example here: http://www.csl.mtu.edu/cs4411/www/NOTES/process/fork/wait.html

wait() takes the address of an integer variable and returns the process ID of the completed process.
More about the wait() system call
The
while (pid!=wait(&ReturnCode));
loop is comparing the process id (pid) returned by wait() to the pid received earlier from a fork or any other process starter. If it finds out that the process that has ended IS NOT the same as the one this parent process has been waiting for, it keeps on wait()ing.

Related

How to check if a process started in the background still running?

It looks like if you create a subprocess via exec.Cmd and Start() it, the Cmd.Process field is populated right away, however Cmd.ProcessState field remains nil until the process exits.
// ProcessState contains information about an exited process,
// available after a call to Wait or Run.
ProcessState *os.ProcessState
So it looks like I can't actually check the status of a process I Start()ed while it's still running?
It makes no sense to me that ProcessState is only set once the process exits. There's a ProcessState.Exited() method, which will always return true in this case.
So I tried to go this route instead: cmd.Process.Pid field exists right after I cmd.Start(), however it looks like os.Process doesn't expose any mechanisms to check if the process is running.
os.FindProcess says:
On Unix systems, FindProcess always succeeds and returns a Process for the given pid, regardless of whether the process exists.
which isn't useful, and it seems like there's no way to go from os.Process to an os.ProcessState unless you .Wait(), which defeats the whole purpose (I want to know if the process is running or not before it has exited).
I think you have two reasonable options here:
Spin off a goroutine that waits for the process to exit. When the wait is done, you know the process exited. (Positive: pretty easy to code correctly; negative: you dedicate an OS thread to waiting.)
Use syscall.Wait4() on the published Pid. A Wait4 with syscall.WNOHANG set returns immediately, filling in the status.
It might be nice if there were an exported os or cmd function that did the Wait4 for you and filled in the ProcessState. You could supply WNOHANG or not, as you see fit. But there isn't.
The point of ProcessState.Exited() is to distinguish between all the various possibilities, including:
process exited normally (with a status byte)
process died due to receiving an unhandled signal
See the stringer for ProcessState. Note that there are more possibilities than these two ... only there seems to be no way to get the others into a ProcessState. The only calls to syscall.Wait seem to be:
syscall/exec_unix.go: after a failed exec, to collect zombies before returning an error; and
os/exec_unix.go: after a call to p.blockUntilWaitable().
If it were not for the blockUntilWaitable, the exec_unix.go implementation variant for wait() could call syscall.Wait4 with syscall.WNOHANG, but blockUntilWaitable itself ensures that this is pointless (and the goal of this particular wait is to wait for exit anyway).

Best way to wait for all child processes to complete in Ruby?

Looking for a way to wait for the completion of all child processes, I found this code:
while true
p "waiting for child processes"
begin
exited_pid = Process.waitpid(-1,Process::WNOHANG)
if exited_pid and exited_pid > 0 then
p "Process exited : #{exited_pid} with status #{$?.exitstatus }"
end
sleep 5
rescue SystemCallError
puts "All children collected!"
break
end
end
This looks like it works in a similar way to Unix-systems process management, as I read on tutorialspoint HERE.
So in summary, it looks like this code:
Calls Process.waitpid, for any child process that exists. If no child process has exited, continue anyway.
If a child process has exited, then notify the user. Otherwise sleep, and check again.
When all child processes have exited an error is thrown, which is caught and the user is notified that processes are complete.
But looking at a similar question on waiting for child processes in C (Make parent wait for all child processes), which has as an answer:
POSIX defines a function: wait(NULL);. It's shorthand for waitpid(-1,
NULL, 0);, which will block until all children processes exit.
I tested that Process.wait() in Ruby achieves pretty much the same thing as the more verbose code above.
What is the benefit of the more verbose code above? Or, which is considered a better approach to waiting for child processes? It seems in the verbose code that I would be able to wait for specific processes and listen for specific exit codes. But if I don't need to do this is there any benefit?
Also, regarding the more verbose code:
Why does the call to Process.waitpid() throw an error if there are no more child processes?
If more than 1 child process exits within the 5 second sleep period, it seems like there is a queue of completed processes and that Process.waitpid just returns the top member of the queue. What is actually happening here?

What does fork() return to its parent when it is called in a child process of another process?

I guess it should not be zero.
EDIT: It is zero.
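The PID of the child process is returned in the parent and 0 in the child upon success, and -1 is returned in the parent upon failure.
A successful fork() call always returns the PID (process ID) of the newly created child to the process that called it, even when the caller is itself a child of another process.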
pid_t fork(void);

Communicating between Ruby processes, loops

I have a Ruby application which must run 24/7 to process information for a web API, both of which are operating on Google Compute Engine on a Debian Instance - the API is served by Sinatra. When I run this script in loop, it uses up the 1-core vCPU. Using a message queuing system like RabbitMQ to pass messages from the API to the backend script seems to me to skip a learning opportunity for communicating between Ruby scripts natively.
How do I keep a script dormant, i.e. awaiting instruction but not consuming memory or 99% CPU? I'm assuming it's not going to be in an infinite loop, but I'm stumped on this.
How would it be best to communicate this message from one script to another? I read about Kernel#Select and forking of subprocesses, but I haven't encountered any definitive or comprehensible solution.
Forking may indeed be a good solution for you, and you only need to understand three system calls to make good use of it: fork(), waitpid() and exec(). I'm not a Ruby guy, so hopefully my C-like explanation will make enough sense for you to fill in the blanks.
The way fork() works is by the operating system making what is conceptually a byte-for-byte copy of the calling process' virtual memory space as it was when fork() was called (in practice, modern kernels share the pages copy-on-write rather than copying them eagerly). This creates a new process with its parent's exact state, except that the child process' fork() call returns 0, while the parent's returns the PID of the new child process. This allows the child process to know that it is a child, and the parent process to know who its children are.
While fork() copies its caller's process image, the exec() system call replaces its caller's process image with a brand new one, as specified by its arguments.
The waitpid() system call is used by the parent process to wait for a return value from a specific child process (one whose process ID was returned to the parent by the fork() call), and then properly log the process' completion with the OS. Even if you don't need your child process' return value, you should call waitpid() on it anyway so you don't end up accumulating "zombie processes."
Again, I'm not a Ruby guy, so hopefully my C-like pseudocode makes sense. Consider the following server:
while (1) { /* an infinite loop */
    /* Wait for and accept connections from your web API. */
    pid_t pid = fork(); /* fork() returns a process ID number */
    /* If fork() returns a negative number, something went wrong. */
    if (pid < 0) {
        exit(1);
    }
    /* If fork() returns 0, this is the child process. */
    else if (pid == 0) {
        /* Remember that because fork() copies your program's state,
           you can use variables you assigned before the fork to
           send to the new process as arguments. */
        execl("./processingscript.rb", "processingscript.rb", arg1, arg2, arg3, ..., (char *)NULL);
        _exit(127); /* only reached if execl() failed */
    }
    /* If fork() returns a number greater than 0 (the PID of the forked
       child process), this is the parent process. */
    else {
        int status;
        waitpid(pid, &status, 0); /* parent process hangs here until
                                     the process with the ID number
                                     pid terminates; status then holds
                                     its exit information. */
    }
}
Written this way, your CPU-intensive script only runs when a connection is received from the web API. It does its processing and then terminates; the server then goes back to waiting for the next connection. You can also specify the "no hang" option (WNOHANG) for waitpid() so that you can fork multiple instances of your processing script concurrently without having your server hang every time it needs to wait for an instance of that script to complete.
Hope this helps! Perhaps somebody who knows Ruby can edit this to be a bit more idiomatic to the language.

How can I get the PID of a new process before it executes?

So that I can do some injecting and interposing using the inject_and_interpose code, I need a way to get the PID of a newly-launched process (a typical closed-source user application) before it actually executes.
To be clear, I need to do better than just "notice it quickly"--I can't be polling, or receiving some asynchronous notification that means that the process has already been executing for a few milliseconds by the time I take action.
I need to have a chance to do my injecting and interposing before a single statement executes.
I'm open to writing a background process that gets synchronously notified when a process by a particular name comes into existence. I'm also open to writing a launcher application that in turn fires up the target application.
Any solution needs to support 64-bit code, at a minimum, under 10.5 (Leopard) through 10.8 (Mountain Lion).
In case this proves to be painfully simple, I'll go ahead and admit that I'm new to OS X :) Thanks!
I know how to do this on Linux, so maybe it would be the same(-ish) on OSX.
You first call fork() to duplicate your process. The return value of fork() indicates whether you are the parent or child. The parent gets the pid of the child process, and the child gets zero.
So then, the child calls exec() to actually begin executing the new executable. With the use of a pipe created before the call to fork, the child could wait on the parent to do whatever it needed before exec'ing the new executable.
pid_t pid = fork();
if (pid == -1) {
    perror("fork");
    exit(1);
}
if (pid > 0) {
    // I am the parent, and pid is the PID of the child process.
    // TODO: If desired, somehow notify child to proceed with exec
}
else {
    // I am the child.
    // TODO: If desired, wait for notification from parent to continue
    execl("path/to/executable", "executable", "arg1", (char *)NULL);
    // Should never get here.
    fprintf(stderr, "ERROR: execl failed!\n");
    _exit(127);
}
