Is there a way to get the call stack of the process to be killed by kill -9 automatically on Redhat/Centos? bpftrace? - linux-kernel

I want to automatically capture the call stack or a core dump of a process just before kill -9 is sent to it.
After checking bpftrace docs, I wrote a simple bpftrace program:
[root]# cat killstack.bt
#!/usr/bin/env bpftrace
tracepoint:syscalls:sys_enter_kill
{
    printf("%-6d -> %-6d, sig:%d\n", pid, args->pid, args->sig);
    if (args->sig == 9) { system("pstack %d\n", args->pid); }
}
[root]# bpftrace killstack.bt --unsafe
Then start up a program to be killed:
[root]# sleep 1000 &
[1] 1822639
[root]# kill -9 1822639
The result is that bpftrace complains it could NOT find the process above.
1804194 -> 1822639, sig:9
Process 1822639 not found.
Two questions:
Why did it fail? I think it must be related to the order in which the events happen. I assumed they execute like this. Am I wrong?
kill -9 1822639 -> sys_enter_kill -> pstack target process (sleep 1000) -> sys_exit_kill -> "kill -9 1822639" done.
If bpftrace can't do that, is there any other way?

Related

How do I kill background processes / jobs started by a bash script after it finishes executing?

I want to start a docker image, then a Django back-end and finally an Angular front-end, let them run as long as I need for testing and development, and then kill them when I'm done. To do this I first tried starting them all in one script, running them in the background, and having a second script do kill %n for both processes. This doesn't work because the background processes belong to another shell, so the second script cannot reference them.
Then I tried this:
#!/bin/bash
# Exit Angular, Django and kill docker_img
function clean_up()
{
    echo "Exiting..."
    kill %2
    kill %1
    docker stop docker_img
    reset
    exit
}
# Trigger cleanup on CTRL + C
trap clean_up SIGINT
# Start docker database
docker start docker_img
# Start django backend
cd ~/Projects/DjangoBackend
source venv/bin/activate
python src/manage.py runserver &
sleep 3
echo 'Done starting django, starting angular'
sleep 1
# Start angular front end
cd ~/Projects/AngularFront
npm start &
However, after npm start & runs, the trap stops working, so it effectively becomes useless. I'm guessing it could be because once my script is done running the trap is no longer active, but I don't know how to fix this. What can I do?
If you want to kill a process on Unix/Linux, one way is to record its PID in a file using the ps -ef command and then use kill -9 on it.
Example:
$ ps -ef | grep <process_name> | grep -v grep | awk -F ' ' '{print $2}' > pid.txt
$ kill -9 `cat pid.txt`
ps -ef lists all running processes; grep with the process name narrows the list to the process you want.
grep -v grep drops the grep command itself, which would otherwise match its own command line.
awk extracts only the PID (the second column) from that output.
kill -9 forcefully kills the process.
The answer turned out to be pretty easy: all I had to do was add wait to the end of the script, which makes the script wait until the background processes are done executing. Since two of the processes are servers, they don't stop unless prompted, so the script just waits until SIGINT is received. At that point it runs the clean_up function and exits gracefully.
Additionally, one could use the same trap with the EXIT trigger instead of SIGINT to clean up when the script exits on its own because the processes have closed.
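The fix described above can be sketched as follows. This is a minimal sketch, not the poster's final script: sleep 30 is a hypothetical stand-in for the Django and Angular servers, and the names start_services and main are illustrative. It combines the wait call with an EXIT trap, and turns SIGINT/SIGTERM into a normal exit so that one handler covers both cases.

```shell
#!/bin/bash

# Hypothetical stand-ins for the long-running servers.
start_services() {
    sleep 30 &    # e.g. python src/manage.py runserver &
    sleep 30 &    # e.g. npm start &
}

clean_up() {
    echo "Exiting..."
    # jobs -p prints the PIDs of this shell's background jobs.
    local pids
    pids=$(jobs -p)
    if [ -n "$pids" ]; then
        kill $pids 2>/dev/null
    fi
}

main() {
    trap clean_up EXIT          # runs whenever the shell exits
    trap 'exit 130' INT TERM    # make Ctrl+C exit the shell, which fires EXIT
    start_services
    wait                        # keeps the script (and its traps) alive
}
```

Calling main as the last line of the script starts the services and then blocks in wait until they finish or a signal arrives.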

Trying to close all child processes when I interrupt my bash script

I have written a bash script to carry out some tests on my system. The tests run in the background and in parallel. The tests can take a long time and sometimes I may wish to abort the tests part way through.
If I press Control+C, it aborts the parent script but leaves the various children running. I want to be able to hit Control+C (or quit some other way) and have all child processes running in the background killed. I have a bit of code that does the job if I'm running the background jobs directly from the terminal, but it doesn't work in my script.
I have a minimal working example.
I have tried using trap in combination with pgrep -P $$.
#!/bin/bash
trap 'kill -n 2 $(pgrep -P $$)' 2
sleep 10 &
wait
I was hoping that hitting Control+C (SIGINT) would kill everything that the script started, but it actually says:
./breakTest.sh: line 1: kill: (3220) - No such process
This number changes, but doesn't seem to apply to any running processes, so I don't know where it is coming from.
I guess if the contents of the trap command get evaluated where the trap command occurs then it might explain the outcome. The 3220 pid might be for pgrep itself.
I'd appreciate some insight here
Thanks
I have found a solution using pkill. This example also deals with many child processes.
#!/bin/bash
trap 'pkill -P $$' SIGINT SIGTERM
for i in {1..10}; do
    sleep 10 &
done
wait
This appears to kill all the child processes elegantly. Though I don't properly understand what the issue was with my original code, apart from sending the correct signal.
In bash, whenever you put & after a command, the command runs as a background job. Each job gets a job number, which is incremented by one for each new job until you exit that terminal session, and a job is referred to by a job spec such as %1. You can use the jobs command to list the running background jobs. To work with a job, use % with the job number. The jobs command also accepts options: jobs -p shows the process IDs of all jobs, and jobs -p %JOB_SPEC shows the process ID of that particular job.
#!/usr/bin/env bash
trap 'kill -9 %1' 2
sleep 10 &
wait
or
#!/usr/bin/env bash
trap 'kill -9 $(jobs -p %1)' 2
sleep 10 &
wait
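As a sketch of how the job table generalizes to several jobs: jobs -p with no job spec prints one PID per background job, so a single kill can cover all of them. The function names below are illustrative, and sleep 30 stands in for the real tests.

```shell
#!/usr/bin/env bash

# Start three stand-in background jobs.
start_workers() {
    local i
    for i in 1 2 3; do
        sleep 30 &
    done
}

# Send SIGTERM to every background job of this shell, then reap them.
kill_workers() {
    local pids
    pids=$(jobs -p)
    if [ -n "$pids" ]; then
        kill $pids
    fi
    wait
}

# In a script, this could be hooked up as:
#   trap 'kill_workers; exit 130' INT TERM
```

kill sends SIGTERM by default here rather than -9, which gives the children a chance to run their own cleanup.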
I implemented something like this a few years back; you can take a look at it: async bash
You can try something like the following:
pkill -TERM -P <your_parent_id_here>

send signals to process using its pid

Hello all,
I am trying to write a shell script that runs a program and sends it a sequence of signals with a delay between them. I wrote the following code.
#!/bin/sh
KNOCK="KNOCK"
export KNOCK
./knock &
knockPID=$!
kill -SIGUSR2 $knockPID
kill -SIGUSR2 $knockPID
kill -SIGUSR1 $knockPID
sleep 2s;
kill -SIGUSR1 $knockPID
kill -SIGUSR2 $knockPID
I keep getting the following error for each of the kill commands
kill: Illegal option -S
your help is appreciated.
Generally, named signal arguments for the kill command are "recognized in a case-independent fashion, without the SIG prefix". So, you want:
kill -USR1 $knockPID
and so on.
kill -s SIGUSR2 $knockPID
should probably work on all modern OSes.
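For illustration, the two spellings without the SIG prefix can be tried without a second program by having the shell signal itself. This is a sketch for a POSIX sh; the received variable exists only for the demonstration and is not part of the original script.

```shell
#!/bin/sh

received=""
trap 'received="$received USR1"' USR1
trap 'received="$received USR2"' USR2

kill -s USR2 "$$"    # POSIX form: -s followed by the name without SIG
kill -USR1 "$$"      # short form, again without the SIG prefix

echo "received:$received"
```

If both forms are accepted, each handler runs once and the final line lists the two signals in the order they were sent.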
I ran into this problem too, I fixed it by using a different shell:
#!/bin/bash

Unable to kill nohup process

If I start the script by ./test.sh &, I am able to kill using kill -SIGINT PID.
But if I start my shell script using nohup ./test.sh & I am unable to kill the process using kill -SIGINT PID.
Please advise how to kill the script using kill -SIGINT PID.
The SIGINT signal means "interrupt from keyboard"; that's why it terminates a script run in the foreground, but not one run in the background or under nohup.
To properly terminate your process, use kill -TERM PID, which works in all three cases.

How to signal orphaned background process?

I am executing a shell script in the background from my Tcl script. The Tcl script ends execution after some time. At this point I assume the background shell script becomes an orphan and is adopted by init.
set res [catch { exec sudo $script &}]
Now the problem is that I am not able to signal my (orphaned) background script. But why? OK, it now belongs to init, but why can't I signal it? Only SIGKILL seems to work, and that kills it. I need to trigger the signal handler I've written to handle SIGUSR2:
trap 'process' SIGUSR2
Why can't I signal my orphan background process? Is there no way this can be done? Or is there some workaround?
EDIT: Seems to work fine when the sleep is not involved. See sample code below:
trap 'kill `cat /var/run/sleep.pid`; foo' SIGUSR2;
foo(){ echo test; }
while true; do
    echo -n .
    sleep 100 &
    echo ${!} > /var/run/sleep.pid
    wait ${!}
done
Works fine when not orphaned - but in the case of orphan process I think the problem is the true pid of sleep gets overwritten and I'm not able to kill it when the trap arrives.
Let's run a small script like this:
bash -c '(trap foo SIGUSR2;foo(){ echo test; };while true; do echo -n .;sleep 1;done) & echo $!'; read
It will fork a background process which just runs and outputs some dots. It will also output the PID of the process, which you can use to check and signal it.
$ ps -f 19489
UID PID PPID C STIME TTY STAT TIME CMD
michas 19489 1 0 23:45 pts/8 S 0:00 bash -c (trap foo SIGUS...
Because the forking shell died directly after running the command in background, the process is now owned by init (PPID=1).
Now you can signal the process to call the handler:
kill -USR2 19489
If you do, you will notice the "test" output at the terminal printing the dots.
There should be no difference, whether you start a background process from shell or tcl. If it runs you can send it a signal and if there is a handler, it will be called.
If it really does not respond to signals, it might be blocked waiting for something, for example in a sleep or waiting for some I/O.
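One way to keep a handler responsive during a long pause, sketched here on the assumption that the script is run by bash: bash runs a trap only after the current foreground command finishes, but the wait builtin returns as soon as a trapped signal arrives. Running sleep in the background and waiting on it therefore lets the handler fire promptly. The worker function and the sleep_pid variable are illustrative names.

```shell
#!/bin/bash

foo() { echo test; }

worker() {
    trap foo USR2
    # On SIGTERM, take the pending sleep down too so it is not left behind.
    trap 'kill "$sleep_pid" 2>/dev/null; exit 0' TERM
    while true; do
        sleep 100 &
        sleep_pid=$!
        # wait returns early (status > 128) when a trapped signal arrives,
        # and the trap runs right away instead of after the full 100 seconds.
        wait "$sleep_pid"
        # Stop the interrupted sleep before starting the next one.
        kill "$sleep_pid" 2>/dev/null
    done
}
```

Keeping the PID in a shell variable rather than in a file such as /var/run/sleep.pid means each instance of the script tracks only its own sleep.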
