fork, execlp, kill and waitpid with fbi but still ending up with zombies

I'm running Pi OS Bullseye on a Pi4 and I'm trying to do the following:
Run a systemd service
Have that service display images (child process C1 exec fbi)
On demand, display next images (child process C2 exec fbi) and tidy up
To this end, the parent process (P) and child attempt this:
P: fork a child (C2)
C2: exec fbi
P: kill and waitpid the previous fbi instance of C1
Every time I run the sequence, the old zombie is reaped and a new one is created, and an extra fbi process is left running. It looks like fbi itself forks another child (C3).
My ps output looks like this:
 PID  PPID  PGRP USER  NI S COMMAND
5706     1  5706 root  -5 S fbi
5705  5499  5499 root  -5 Z fbi
5704     1  5704 root  -5 S fbi
5702     1  5702 root  -5 S fbi
5700     1  5700 root  -5 S fbi
5698     1  5698 root  -5 S fbi
5696     1  5696 root  -5 S fbi
5694     1  5694 root  -5 S fbi
If I manually kill an old instance of fbi (e.g. 5694 in the above trace) they all die, which makes me think this is a peculiarity with fbi.
Also, my app reported 5705 as the forked process - what is 5706?
And how is PID 1 launching fbi (PID 5706) instead of my app (PID 5499)?
I could ignore the zombie as there is only a single instance. If there are multiple new image requests (and another display) the number of running fbi instances will mount up.
I've also tried using process groups (setpgid, killpg) with the same outcome.
Can someone please explain what I'm doing wrong? This is beginning to hurt.
My code (display_front_img is the entry point; yes, there will be more displays):
void show_image(char *fb_dev_file, char *img_path, pid_t forked_pid, pid_t last_pid)
{
    if(forked_pid == 0) // child executes fbi
    {
        execlp("/usr/bin/fbi", "fbi", "-d", fb_dev_file, "-T", "1", "--noverbose", img_path, NULL);
    }
    // parent cleans up
    else if(last_pid != 0) // if previous fbi called
    {
        fprintf(stderr, "kill %d returned %d\n", last_pid, kill(last_pid, SIGTERM));
        fprintf(stderr, "wait %d returned %d\n", last_pid, waitpid(last_pid, NULL, 0));
    }
}
void display_front_img(char * img_path)
{
    pid_t last_pid = front_fbi_pid; // get previous assigned PID
    front_fbi_pid = fork(); // get newly assigned PID or 0
    show_image("/dev/fb0", img_path, front_fbi_pid, last_pid);
}

Does MPI_Scatter influence MPI_Bcast?

I'm sending an integer that triggers termination via MPI_Bcast. The root sets a variable called "running" to zero and sends the BCast. The Bcast seems to complete but I can't see that the value is sent to the other processes. The other processes seem to be waiting for an MPI_Scatter to complete. They shouldn't even be able to arrive here.
I have done much research on MPI_Bcast and from what I understand it should be blocking. This is confusing me since the MPI_Bcast from the root seems to complete even though I can't find the matching (receiving) MPI_Bcasts for the other processes. I have surrounded all of my MPI_Bcasts with printfs and the output of those printfs 1) print and 2) print the correct values from the root.
The root looks as follows:
while (running || ...) {
    /*Do stuff*/
    if (...) {
        running = 0;
        printf("Running = %d and Bcast from root\n", running);
        MPI_Bcast(&running, 1, MPI_INT, 0, MPI_COMM_WORLD);
        printf("Root 0 Bcast complete. Running %d\n", running);
        /* Do some more stuff and eventually reach Finalize */
        printf("Root is Finalizing\n");
        MPI_Finalize();
    }
}
The other processes have the following code:
while (running) {
    doThisFunction(rank);
    printf("Waiting on BCast from root with myRank: %d\n", rank);
    MPI_Bcast(&running, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("P%d received running = %d\n", rank, running);
    if (running == 0) { // just to make sure.
        break;
    }
}
MPI_Finalize();
I also have the following in the function "doThisFunction()". This is where the processes seem to be waiting for process 0:
int doThisFunction(...) {
    /*Do stuff*/
    printf("P%d waiting on Scatter\n", rank);
    MPI_Scatter(buffer, 130, MPI_BYTE, encoded, 130, MPI_BYTE, 0, MPI_COMM_WORLD);
    printf("P%d done with Scatter\n", rank);
    /*Do stuff*/
    printf("P%d waiting on gather\n", rank);
    MPI_Gather(encoded, 1, MPI_INT, buffer, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("P%d done with gather\n", rank);
    /*Do Stuff*/
    return aValue;
}
The output in the command line looks as follows:
P0 waiting on Scatter
P0 done with Scatter
P0 waiting on gather
P0 done with gather
Waiting on BCast from root with myRank: 1
P1 received running = 1
P1 waiting on Scatter
P0 waiting on Scatter
P0 done with Scatter
P0 waiting on gather
P0 done with gather
P1 done with Scatter
P1 waiting on gather
P1 done with gather
Waiting on BCast from root with myRank: 1
P1 received running = 1
P1 waiting on Scatter
Running = 0 and Bcast from root
Root 0 Bcast complete. Running 0
/* Why does it say the Bcast is complete
/* even though P1 didn't output that it received it?
Root is Finalizing
/* Deadlocked...
I'm expecting that P1 receives running as zero and then goes into MPI_Finalize() but rather it gets stuck at the scatter which will not be accessed by the root which is already trying to finalize.
In actuality, the program is in deadlock and won't terminate MPI.
I doubt that the problem is that the scatter is accepting the Bcast value because this doesn't even make sense since the root doesn't call scatter.
Does anyone please have any tips on how to resolve this problem?
Your help is greatly appreciated.
Why does it say the Bcast is complete even though P1 didn't output that it received it?
Note the following definitions from the MPI Standard:
Collective operations can (but are not required to) complete as soon as the caller's participation in the collective communication is finished. ... The completion of a collective operation indicates that the caller is free to modify locations in the communication buffer. It does not indicate that other processes in the group have completed or even started the operation (unless otherwise implied by the description of the operation). Thus, a collective communication operation may, or may not, have the effect of synchronizing all calling processes. This statement excludes, of course, the barrier operation.
According to this definition, your MPI_Bcast on the root process can finish even if none of the other processes has called MPI_Bcast yet.
(For point-to-point operations, we have different communication modes, such as the synchronous one, to address these issues. Unfortunately, there is no synchronous mode for collectives.)
The problem is the order of operations in your code: all processes in a communicator must call collective operations in the same order. The root called MPI_Bcast, but process #1 had not; as your log output shows, it was waiting in MPI_Scatter.

How does the fork() function work?

Can someone explain this code?
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int i = 0;
    while (fork() != 0 && i < 2)
        i = i + 1;
    printf(" this is the process %d and ends with i=%d \n", getpid(), i);
    return 0;
}
What I have understood is that the parent process has 3 children,
but given this execution output I am not sure I have understood the fork function:
[root@www Desktop]# ./prog1
this is the process 8025 and ends with i=2
[root@www Desktop]# this is the process 8027 and ends with i=1
this is the process 8028 and ends with i=2
this is the process 8026 and ends with i=0
Thank You !
Remember that fork() duplicates your process, resulting in two more-or-less identical processes. The difference between them is the return value of fork(): 0 in the child, and the child's pid in the parent.
Your while loop only iterates for the parent processes (it ends for the child processes since in those processes the return value of fork() is 0). So the first time through (i==0), the child process falls through, prints its pid, and exits. The parent remains.
The parent increments i, forks again, the child (i==1) falls through, prints its pid and exits. So that's one exit with i==0 and one exit with i==1.
The parent increments i, forks again, but i is now 2, so the while loop exits for both parent and child processes. Both exit with i==2. So in total that's one exit with i==0, one exit with i==1 and two exits with i==2.
A couple of other points to bear in mind:
processes are not guaranteed to be sequential, so the output may be out-of-(expected)-order (as it is in your example)
an optimising compiler may also mess with sequencing. Compiling with -O0 may make the output (sequence) more what you expect.
$ gcc -w -O0 forktest.c && ./a.out
this is the process 5028 and ends with i=0
this is the process 5029 and ends with i=1
this is the process 5030 and ends with i=2
this is the process 5027 and ends with i=2

Child process won't suicide if parent dies

I have a subprocess (running on MacOS) that I want to kill itself if the parent quits, exits, terminates, is killed or crashes. Having followed the advice from How to make child process die after parent exits? I can't get it to quietly kill itself if the parent program crashes. It will go to 100% CPU until I manually kill it.
Here are the key points of the code:
int main(int argc, char *argv[])
{
    // Catch signals
    signal(SIGINT, interruptHandler);
    signal(SIGABRT, interruptHandler);
    signal(SIGTERM, interruptHandler);
    signal(SIGPIPE, interruptHandler);
    // Create kqueue event filter
    int kqueue_fd = kqueue();
    struct kevent kev, recv_kev;
    EV_SET(&kev, parent_pid, EVFILT_PROC, EV_ADD|EV_ENABLE, NOTE_EXIT, 0, NULL);
    kevent(kqueue_fd, &kev, 1, NULL, 0, NULL);
    struct pollfd kqpoll;
    kqpoll.fd = kqueue_fd;
    kqpoll.events = POLLIN;
    // Start a run loop
    while(processEvents())
    {
        if(kill(parent_pid, 0) == -1)
            if(errno == ESRCH)
                break;
        if(poll(&kqpoll, 1, 0) == 1)
            if(kevent(kqueue_fd, NULL, 0, &recv_kev, 1, NULL))
                break;
        parent_pid = getppid();
        if(parent_pid == 1)
            break;
        sleep(a_short_time);
        // (simple code here causes subprocess to sleep longer if it hasn't
        // received any events recently)
    }
}
Answering my own question here:
The problem had nothing to do with detecting whether the parent process had died. In processEvents() I was polling the pipe from the parent process to see if there was any communication. When the parent died, poll() returned a value of 1 and the read loop thought there was infinite data waiting to be read.
The solution was to detect whether the pipe had been disconnected or not.

Emulating pipes

I've just recently learned about pipes and I would like to emulate the "|" gimmick provided by shells.
In the code below, the parent process spawns 2 child processes, after which they do their piping and get replaced by ls and grep. While that happens the parent process waits patiently. The problem is that the child processes never finish although they manage to send some data through the pipe and onto the screen.
There are other posts regarding pipes on SO, but I've never seen the setup in which the parent process launches 2 children. I've only seen the parent communicating with one child.
int p0[2];
pipe(p0); //creating pipe
if(fork() == 0) { //child 1
    dup2(p0[0], STDIN_FILENO);
    close(p0[0]); close(p0[1]);
    execlp("grep","grep","a",NULL);
}
else { //parent
    if(fork() == 0) { //child 2
        dup2(p0[1], STDOUT_FILENO);
        close(p0[0]); close(p0[1]);
        execlp("ls","ls",NULL);
    }
    else { //parent
        wait(NULL);
        wait(NULL); //waiting for c1 and c2
        close(p0[0]); close(p0[1]);
        printf("parent exit\n");
    }
}
My questions are: Why don't the child processes finish? Is the fork-pipe structure sound or am I doing it completely wrong?
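Close both ends of the pipe in the parent before it starts to wait in the last section. As long as the parent still holds the write end open, grep never sees end-of-file on its stdin, so it never exits and the parent waits forever.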

After suspending child process with SIGTSTP, shell not responding

I'm coding a basic shell in C, and I'm working on suspending a child process right now.
I think my signal handler is correct, and my child process is suspending, but after that, the terminal should return to the parent process and that's not happening.
The child is suspended, but my shell isn't registering any input or output anymore. tcsetpgrp() doesn't seem to be helping.
Here's my signal handler in my shell code for SIGTSTP:
void suspend(int sig) {
    pid_t pid;
    sigset_t mask;
    //mpid is the pgid of this shell.
    tcsetpgrp(STDIN_FILENO, mpid);
    tcsetpgrp(STDOUT_FILENO, mpid);
    sigemptyset(&mask);
    sigaddset(&mask, SIGTSTP);
    sigprocmask(SIG_UNBLOCK, &mask, NULL);
    signal(SIGTSTP, SIG_DFL);
    //active.pid is the pid of the child currently in the fg.
    if (active.pid != 0) {
        kill(active.pid, SIGTSTP);
    }
    else{
        //if this code is being run in the child, child calls SIGTSTP on itself.
        pid = getpid();
        if (pid != 0 && pid != mpid){
            kill(pid, SIGTSTP);
        }
    }
    signal(SIGTSTP, suspend);
}
Can anyone tell me what I'm doing wrong?
Am I suspending my shell along with the child, and do I need to return stdin and stdout to the shell somehow? How would I do this?
Thanks!
It's an old question, but I think I found an answer.
You didn't post your parent's code, but I'm assuming it looks something like:
int main(){
    int status;
    pid_t pid = fork();
    if(pid == 0) {
        //child process: call some program
    } else {
        //parent process
        wait(&status); //or waitpid(pid, &status, 0)
    }
    //continue with the program
}
The problem is with the wait() or waitpid(). It looks like, if you run your program on an OS like Ubuntu, after pressing Ctrl+Z your child process gets the SIGTSTP but the wait() function in the parent process is still waiting!
The right way of doing that is to replace the wait() in the parent with pause(), and add another handler that catches SIGCHLD. For example:
void sigHandler(int signum){
    int status;
    switch(signum){
    case SIGCHLD:
        // note that the last argument is important for the wait to work
        waitpid(-1, &status, WNOHANG);
        break;
    }
}
In this case, after the child process receives Ctrl+Z, the parent process also receives SIGCHLD and pause() returns.
tcsetpgrp specifies which job is the foreground job. When your shell spawns a job in the foreground (without &), it should create a new process group and make that the foreground job (of the controlling terminal, not whatever's on STDIN). Then, upon pressing CTRL-Z, that job will get the TSTP. It's the terminal that suspends the job, not your shell. Your shell shouldn't trap TSTP or send TSTP to anyone.
It should just wait() for the job it has spawned and detect when it has been stopped (and claim back the foreground group and mark the job as suspended internally). Your fg command would make the job's pgid the foreground process group again, send a SIGCONT to it and wait for it again, while bg would just send the SIGCONT.
I used fork with signals to make a process pause and resume with Ctrl+C.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
void reverse_handler(int sig);
_Bool isPause=0;
_Bool isRunning=1;
int main()
{
    int ppid;
    int counter=0;
    //make parent respond for ctrl+c (pause,resume).
    signal(SIGINT,reverse_handler);
    while(isRunning){
        while(isPause==0)
        {
            /*code exec while process is resuming */
            printf("\nc:%d",counter++);
            fflush(stdout);
            sleep(1);
        }
        //close parent after child is alive.
        if((ppid=fork())==0){ exit(0); }
        //make child respond for ctrl+c (pause,resume).
        signal(SIGINT,reverse_handler);
        //keep child alive and listening.
        while(isPause==1){ /*code exec while process is pausing */ sleep(1); }
    }
    return 0;
}
//if process is pause made it resume and vice versa.
void reverse_handler(int sig){
    if(isPause==0){
        printf("\nPaused");
        fflush(stdout);
        isPause=1;
    }
    else if(isPause==1){
        printf("\nresuming");
        fflush(stdout);
        isPause=0;
    }
}
I hope that's useful.
Please leave a comment if you have any questions.
I might be late to answer the question here, but this is what worked when I was stuck with the same problem. According to the man page for tcsetpgrp():
The function tcsetpgrp() makes the process group with process group ID
pgrp the foreground process group on the terminal associated to fd,
which must be the controlling terminal of the calling process, and
still be associated with its session. Moreover, pgrp must be a
(nonempty) process group belonging to the same session as the calling
process.
If tcsetpgrp() is called by a member of a background process group in
its session, and the calling process is not blocking or ignoring
SIGTTOU, a SIGTTOU signal is sent to all members of this background
process group.
So, what worked for me was ignoring the signal SIGTTOU in the shell program, before I created the processes that would come to the foreground. If I do not ignore this signal, then the kernel will send this signal to my shell program and suspend it.
