In what order should I send signals to gracefully shutdown processes?


In a comment on this answer of another question, the commenter says:
"don’t use kill -9 unless absolutely necessary! SIGKILL can’t be trapped so the killed program can’t run any shutdown routines to e.g. erase temporary files. First try HUP (1), then INT (2), then QUIT (3)"
I agree in principle about SIGKILL, but the rest is news to me. Given that the default signal sent by kill is SIGTERM, I would expect it is the most-commonly expected signal for graceful shutdown of an arbitrary process. Also, I have seen SIGHUP used for non-terminating reasons, such as telling a daemon "re-read your config file." And it seems to me that SIGINT (the same interrupt you'd typically get with Ctrl-C, right?) isn't as widely supported as it ought to be, or terminates rather ungracefully.
Given that SIGKILL is a last resort — Which signals, and in what order, should you send to an arbitrary process, in order to shut it down as gracefully as possible?
Please substantiate your answers with supporting facts (beyond personal preference or opinion) or references, if you can.
Note: I am particularly interested in best practices that include consideration of bash/Cygwin.
Edit: So far, nobody seems to mention INT or QUIT, and there's limited mention of HUP. Is there any reason to include these in an orderly process-killing?

SIGTERM tells an application to terminate. The other signals tell the application other things which are unrelated to shutdown but may sometimes have the same result. Don't use those. If you want an application to shut down, tell it to. Don't give it misleading signals.
Some people believe the smart standard way of terminating a process is by sending it a slew of signals, such as HUP, INT, TERM and finally KILL. This is ridiculous. The right signal for termination is SIGTERM and if SIGTERM doesn't terminate the process instantly, as you might prefer, it's because the application has chosen to handle the signal. Which means it has a very good reason to not terminate immediately: It's got cleanup work to do. If you interrupt that cleanup work with other signals, there's no telling what data from memory it hasn't yet saved to disk, what client applications are left hanging or whether you're interrupting it "mid-sentence" which is effectively data corruption.
For more information on what the real meaning of the signals is, see sigaction(2). Don't confuse "Default Action" with "Description", they are not the same thing.
SIGINT is used to signal an interactive "keyboard interrupt" of the process. Some programs may handle the situation in a special way for the purpose of terminal users.
SIGHUP is used to signal that the terminal has disappeared and is no longer looking at the process. That is all. Some processes choose to shut down in response, generally because their operation makes no sense without a terminal, some choose to do other things such as recheck configuration files.
SIGKILL is used to forcefully remove the process from the kernel. It is special in the sense that it's not actually a signal to the process but rather gets interpreted by the kernel directly.
Don't send SIGKILL. SIGKILL should certainly never be sent by scripts. If the application handles SIGTERM, cleanup can take a second, a minute, or an hour, depending on what the application has to get done before it's ready to end. Any logic that "assumes" an application's cleanup sequence has taken long enough and needs to be shortcut or SIGKILLed after X seconds is just plain wrong.
The only reason why an application would need a SIGKILL to terminate is if something bugged out during its cleanup sequence. In that case you can open a terminal and SIGKILL it manually. Aside from that, the only other reason why you'd SIGKILL something is because you WANT to prevent it from cleaning itself up.
Even though half the world blindly sends SIGKILL after 5 seconds, it's still a horribly wrong thing to do.
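To make the cleanup argument concrete, here is a minimal bash sketch of a process that handles SIGTERM. The function name worker and the temp file are made up for illustration; a real application's cleanup would be whatever it needs to flush or release.

```bash
#!/bin/bash
# worker TMPFILE: stand-in for a long-running program (hypothetical name).
# Meant to be run in the background, e.g. `worker /tmp/demo.tmp &`.
worker() {
    local tmpfile=$1 sleep_pid=
    # On SIGTERM: stop the helper sleep, remove the temp file, exit cleanly.
    trap 'kill "$sleep_pid" 2>/dev/null; rm -f "$tmpfile"; exit 0' TERM
    : > "$tmpfile"              # "state" that must not be left behind
    while :; do
        sleep 1 &               # sleep in the background...
        sleep_pid=$!
        wait "$sleep_pid"       # ...so the trap runs as soon as the signal arrives
    done
}
```

Sending SIGTERM to the backgrounded worker runs the trap and removes the file. Sending SIGKILL instead would leave the file behind, because the trap never gets a chance to run.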

Short Answer: Send SIGTERM, then 30 seconds later, SIGKILL. That is, send SIGTERM, wait a bit, then send SIGKILL. The wait may vary from program to program, and you may know your system better, but 5 to 30 seconds is enough. (When shutting down a machine, you may see it automatically waiting up to 1 minute 30 seconds. Why the hurry, after all?)
Reasonable Answer: SIGTERM, SIGINT, SIGKILL
This is more than enough. The process will very probably terminate before SIGKILL.
Long Answer: SIGTERM, SIGINT, SIGQUIT, SIGABRT, SIGKILL
This is unnecessary, but at least you are not misleading the process regarding your message. All these signals do mean you want the process to stop what it is doing and exit.
No matter what answer you choose from this explanation, keep that in mind!
If you send a signal that means something else, the process may handle it in very different ways (on one hand). On the other hand, if the process doesn't handle the signal, it doesn't matter what you send after all, the process will quit anyway (when the default action is to terminate, of course).
So, you must think of yourself as a programmer. Would you code a function handler for, let's say, SIGHUP to quit a program that connects with something, or would you loop it to try to connect again? That is the main question here! That is why it is important to just send signals that mean what you intend.
Almost Stupid Long Answer:
The table below contains the relevant signals, and the default actions in case the program does not handle them.
I ordered them in the order I suggest using them (BTW, I suggest you use the reasonable answer, not this one here), if you really need to try them all (it would be fun to say the table is ordered in terms of the destruction they may cause, but that is not completely true).
The signals with an asterisk (*) are NOT recommended. The important thing about these is that you may never know what the program is coded to do with them. Especially SIGUSR! It may start the apocalypse (it is a free signal for a programmer to do whatever he/she wants!). But, if not handled OR in the unlikely case it is handled to terminate, the program will terminate.
In the table, the signals with default options to terminate and generate a core dump are left in the end, just before SIGKILL.
Signal      Value  Action  Comment
----------------------------------------------------------------------
SIGTERM     15     Term    Termination signal
SIGINT       2     Term    Famous CONTROL+C interrupt from keyboard
SIGHUP       1     Term    Disconnected terminal or parent died
SIGPIPE     13     Term    Broken pipe
SIGALRM(*)  14     Term    Timer signal from alarm
SIGUSR2(*)  12     Term    User-defined signal 2
SIGUSR1(*)  10     Term    User-defined signal 1
SIGQUIT      3     Core    CONTROL+\ or quit from keyboard
SIGABRT      6     Core    Abort signal from abort(3)
SIGSEGV     11     Core    Invalid memory reference
SIGILL       4     Core    Illegal Instruction
SIGFPE       8     Core    Floating point exception
SIGKILL      9     Term    Kill signal
Then I would suggest for this almost stupid long answer:
SIGTERM, SIGINT, SIGHUP, SIGPIPE, SIGQUIT, SIGABRT, SIGKILL
And finally, the
Definitely Stupid Long Long Answer:
Don't try this at home.
SIGTERM, SIGINT, SIGHUP, SIGPIPE, SIGALRM, SIGUSR2, SIGUSR1, SIGQUIT, SIGABRT, SIGSEGV, SIGILL, SIGFPE and if nothing worked, SIGKILL.
SIGUSR2 should be tried before SIGUSR1 because we are better off if the program doesn't handle the signal. And it is much more likely for it to handle SIGUSR1 if it handles just one of them.
BTW, about KILL: it is not wrong to send SIGKILL to a process, as another answer stated. Think about what happens when you run a shutdown command: it will try SIGTERM and SIGKILL only. Why do you think that is the case? And why do you need any other signals, if the very shutdown command uses only these two?
Now, back to the long answer, this is a nice oneliner:
for SIG in 15 2 3 6 9 ; do echo $SIG ; echo kill -$SIG $PID || break ; sleep 30 ; done
It sleeps for 30 seconds between signals. Why else would you need a oneliner? ;)
Also, recommended: try it with only signals 15 2 9 from the reasonable answer.
Safety: remove the second echo when you are ready to go. I call it my dry run for one-liners. Always use it to test.
Script killgracefully
Actually I was so intrigued by this question that I decided to create a small script to do just that. Please, feel free to download (clone) it here:
GitHub link to Killgracefully repository

Typically you'd send SIGTERM, the default of kill. It's the default for a reason. Only if a program does not shut down in a reasonable amount of time should you resort to SIGKILL. But note that with SIGKILL the program has no possibility to clean things up and data could be corrupted.
As for SIGHUP, HUP stands for "hang up" and historically meant that the modem disconnected. It's essentially equivalent to SIGTERM. The reason that daemons sometimes use SIGHUP to restart or reload config is that daemons detach from any controlling terminals as a daemon doesn't need those and therefore would never receive SIGHUP, so that signal was considered as "freed up" for general use. Not all daemons use this for reload! The default action for SIGHUP is to terminate and many daemons behave that way! So you can't go blindly sending SIGHUPs to daemons and expecting them to survive.
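As an illustration of the "reload on SIGHUP" convention described above, here is a small bash sketch. Everything in it (toy_daemon, the config file, the state file standing in for "the configuration currently loaded") is hypothetical; it only shows why a blind SIGHUP can mean "reload" for one program and "die" for another.

```bash
#!/bin/bash
# toy_daemon CONFIG_FILE STATE_FILE: hypothetical daemon, run it in the background.
# STATE_FILE stands in for the configuration the daemon currently has loaded.
toy_daemon() {
    local config_file=$1 state_file=$2
    trap 'cat "$config_file" > "$state_file"' HUP   # SIGHUP: re-read config, keep running
    trap 'exit 0' TERM                              # SIGTERM: shut down
    cat "$config_file" > "$state_file"              # initial load
    while :; do
        sleep 1 &
        wait $!     # wait is interrupted by signals, so traps run promptly
    done
}
```

Without the first trap line, the default action for SIGHUP applies and the same kill -HUP would terminate the process instead of reloading it.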
Edit: SIGINT is probably inappropriate to terminate a process, as it's normally tied to ^C or whatever the terminal setting is to interrupt a program. Many programs capture this for their own purposes, so it's common enough for it not to work. SIGQUIT typically has the default of creating a core dump, and unless you want core files lying around it's not a good candidate, either.
Summary: if you send SIGTERM and the program doesn't die within your timeframe then send it SIGKILL.

SIGTERM actually means sending an application a message: "would you be so kind and commit suicide". It can be trapped and handled by the application to run cleanup and shutdown code.
SIGKILL cannot be trapped by the application. The application gets killed by the OS without any chance for cleanup.
It's typical to send SIGTERM first, sleep some time, then send SIGKILL.

SIGTERM is equivalent to "clicking the 'X' " in a window.
SIGTERM is what Linux uses first, when it is shutting down.

With all the discussion going on here, no code has been offered. Here's my take:
#!/bin/bash
pid=1234   # PID of the process to stop (no "$" and no spaces in an assignment)
echo "Killing process $pid..."
kill "$pid"   # plain kill sends SIGTERM
waitAttempts=30
for i in $(seq 1 "$waitAttempts")
do
    echo "Checking if process is alive (attempt #$i / $waitAttempts)..."
    sleep 1
    if ps -p "$pid" > /dev/null
    then
        echo "Process $pid is still running"
    else
        echo "Process $pid has shut down successfully"
        break
    fi
done
# Still alive after all attempts: fall back to SIGKILL
if ps -p "$pid" > /dev/null
then
    echo "Could not shut down process $pid gracefully - killing it forcibly..."
    kill -SIGKILL "$pid"
fi

HUP sounds like rubbish to me. I'd send it to get a daemon to re-read its configuration.
SIGTERM can be intercepted; your daemon just might have clean-up code to run when it receives that signal. You cannot do that for SIGKILL. Thus with SIGKILL you are not giving the daemon's author any options.
More on that on Wikipedia

Related

kill process using sigterm and escalate to sigkill after timeout

Is there a way to sigterm a process with a timeout? If the process does not gracefully terminate within 30 minutes, the process should get sigkill. Ideally, this graceful shutdown should be executed in the background.

There's the timeout command, which allows you to cap a process' execution time and escalate to a SIGKILL if it doesn't respond promptly to the initial signal (SIGTERM by default). This isn't quite what you're asking for, but it might be sufficient.
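A rough sketch of that approach, assuming the timeout from GNU coreutils (the -s and -k options are its short forms of --signal and --kill-after); run_capped is just a name made up for this example:

```bash
#!/bin/bash
# run_capped LIMIT GRACE COMMAND...  (hypothetical helper name)
# Runs COMMAND; after LIMIT seconds it is sent SIGTERM, and GRACE seconds
# after that it is sent SIGKILL if it is still running.
run_capped() {
    local limit=$1 grace=$2
    shift 2
    timeout -s TERM -k "$grace" "$limit" "$@"
}
```

For example, run_capped 1800 30 some_command would allow some_command (a placeholder) 30 minutes, then SIGTERM, then SIGKILL 30 seconds later. GNU timeout exits with status 124 when the command hit the time limit. Note that this only works for a command you start through timeout, not for a process that is already running.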
To do what you're actually describing (send a signal, briefly await, then send a kill) you may have to do a bit of bookkeeping yourself, as this question details.
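For an already-running process, the bookkeeping could look roughly like the sketch below. The function name graceful_kill and its return codes are my own choices, not a standard tool, and polling by PID carries the usual caveat that a PID can in principle be reused by another process.

```bash
#!/bin/bash
# graceful_kill PID [TIMEOUT]  (hypothetical helper)
# Sends SIGTERM, polls once per second for up to TIMEOUT seconds (default 30),
# then sends SIGKILL if the process is still there.
# Returns 0 if the process went away after SIGTERM, 1 if SIGKILL was needed,
# 2 if the process could not be signalled at all.
graceful_kill() {
    local pid=$1 timeout=${2:-30} waited=0
    kill -TERM "$pid" 2>/dev/null || return 2
    # kill -0 sends no signal; it only checks that the process can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

To get the 30-minute, in-the-background behaviour from the question, something like graceful_kill 1234 1800 & would do, where 1234 is a placeholder PID.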
One option would be to use Upstart (or I imagine other service managers), which provides a kill timeout n command that does what you want.
As an aside, many systems would treat 30 minutes as much too long to wait for SIGTERM. Linux does something akin to what you're describing on shutdown, for instance, but gives processes barely a few seconds to clean up and exit before SIGKILLing them. For other use cases you certainly can have a long-lived termination like you describe (e.g. with Upstart), but YMMV.

waitpid in infinite wait state after PTRACE_ATTACH

I have integrated Google-Breakpad in my C++ application. Now, I am deliberately crashing the application but it hangs-up in my Ubuntu i686 system. I have to put printf everywhere in Breakpad to check where exactly it is hanging. So, in breakpad, a clone child process is being created and in that process ptrace(PTRACE_ATTACH, pid, NULL, NULL) followed by waitpid(pid, NULL, __WALL) syscall is being called on every thread. With one particular thread waitpid is entering in infinite wait state and I then have to deliberately kill the application.
Does anyone know why exactly this is happening? Why does waitpid() go into an infinite wait state with this one particular thread? Is there any solution for this?
Thanks.

In general, PTRACE_ATTACH does not guarantee that a process will have anything to report. After PTRACE_ATTACH, waitpid will trigger only if one of two things happens:
The debuggee receives a signal.
The debuggee exits.
Some things are tantamount to one of those things. For example, if the debuggee calls execve, then after a successful execution the kernel makes it appear as if the debuggee received a TRAP signal.
If none of those situations happen, there is no reason for PTRACE_ATTACH to do anything at all.
If you want waitpid to return (say, because you want to change the debuggee's state), then simply send a signal to the thread after calling PTRACE_ATTACH. This will guarantee that the thread has something to tell you.

Inter-process communication in C

I have a scenario, where one process should wait for a signal from another process, and this wait should be blocking wait, and as soon as it gets a signal, it should wake up.
However, with mechanisms like kill() or raise(), the first process goes to a wait state, but periodically checks after a specified amount of time whether the event/signal occurred or not, and decides to wait or go on.
My requirement is a bit stringent: I want the process to wake up at the same instant the signal is received.
Please suggest something.

This can be achieved using a semaphore, a mutex, or a condition variable. Or you can write your own wait and signal functions and control their behavior as needed. For reference see here: IPC examples
IPC concept and Examples Mutex and Conditional Variables

Resque: Does dequeueing kill the process?

I'm implementing resque on this project where I need the feature of killing whatever gets enqueued to resque. So, I've seen that there is a dequeuing method, which will remove the jobs from the queue. But, if this job has already been started, and is currently running, does dequeuing kill the process?
Also important: If a job gets dequeued, do I get a handle where I can do something, or is an exception thrown?

As far as I know, it doesn't kill the process; it just removes the job from the queue if it exists. Check here
But if you want to kill a running job, then you need to use one of the signals that resque provides.
Here is a list of them:
Resque workers respond to a few different signals:
QUIT - Wait for child to finish processing then exit
TERM / INT - Immediately kill child then exit
USR1 - Immediately kill child but don't exit
USR2 - Don't start to process any new jobs
CONT - Start to process new jobs again after a USR2
In your case it would be USR1.
Hope this helps

The answer to this issue was actually using one of the many extensions for the resque gem, called resque-status. This handles worker instances, assigns a unique id to each of them (which I can use to identify them, the feature I needed the most) and provides me with a kill method to be called on a job, which will guarantee that the job will process the kill signal the next time I call a certain method of their API (not exactly a kill and assign exception, but it's better than nothing).

PHP CLI in Windows: Handling Ctrl-C commands?

How can I handle CTRL+C in PHP on the command line? Pcntl_* functions do not work in Windows.

The following works on unix systems.
We can catch keys using stream_get_contents(), but it does not catch the CTRL key. Filtering for ^C does not work either.
What we need to do is catch the SIGINT POSIX signal.
To suppress the default CTRL + c behavior
(the program won't quit, so you then need to implement another way of exiting!):
function shutdown(){};
pcntl_signal(SIGINT,"shutdown");
To handle CTRL + c, and run some code before exiting:
function shutdown(){
echo "\033c"; // Clear terminal
system("tput cnorm && tput cup 0 0 && stty echo"); // Restore cursor default
echo PHP_EOL; // New line
exit; // Clean quit
}
register_shutdown_function("shutdown"); // Handle END of script
declare(ticks = 1); // Allow posix signal handling
pcntl_signal(SIGINT,"shutdown"); // Catch SIGINT, run shutdown()
List of POSIX signals:
PHP won't catch SIGKILL; it can't be caught.
SIGABRT and SIGIOT
The SIGABRT and SIGIOT signal is sent to a process to tell it to abort, i.e. to terminate. The signal is usually initiated by the process itself when it calls abort() function of the C Standard Library, but it can be sent to the process from outside like any other signal.
SIGALRM, SIGVTALRM and SIGPROF
The SIGALRM, SIGVTALRM and SIGPROF signal is sent to a process when the time limit specified in a call to a preceding alarm setting function (such as setitimer) elapses. SIGALRM is sent when real or clock time elapses. SIGVTALRM is sent when CPU time used by the process elapses. SIGPROF is sent when CPU time used by the process and by the system on behalf of the process elapses.
SIGBUS
The SIGBUS signal is sent to a process when it causes a bus error. The conditions that lead to the signal being sent are, for example, incorrect memory access alignment or non-existent physical address.
SIGCHLD
The SIGCHLD signal is sent to a process when a child process terminates, is interrupted, or resumes after being interrupted. One common usage of the signal is to instruct the operating system to clean up the resources used by a child process after its termination without an explicit call to the wait system call.
SIGCONT
The SIGCONT signal instructs the operating system to continue (restart) a process previously paused by the SIGSTOP or SIGTSTP signal. One important use of this signal is in job control in the Unix shell.
SIGFPE
The SIGFPE signal is sent to a process when it executes an erroneous arithmetic operation, such as division by zero. This may include integer division by zero, and integer overflow in the result of a divide (only INT_MIN/-1, INT64_MIN/-1 and %-1 accessible from C).[2][3].
SIGHUP
The SIGHUP signal is sent to a process when its controlling terminal is closed. It was originally designed to notify the process of a serial line drop (a hangup). In modern systems, this signal usually means that the controlling pseudo or virtual terminal has been closed.[4] Many daemons will reload their configuration files and reopen their logfiles instead of exiting when receiving this signal.[5] nohup is a command to make a command ignore the signal.
SIGILL
The SIGILL signal is sent to a process when it attempts to execute an illegal, malformed, unknown, or privileged instruction.
SIGINT
The SIGINT signal is sent to a process by its controlling terminal when a user wishes to interrupt the process. This is typically initiated by pressing Ctrl+C, but on some systems, the "delete" character or "break" key can be used.[6]
SIGKILL
The SIGKILL signal is sent to a process to cause it to terminate immediately (kill). In contrast to SIGTERM and SIGINT, this signal cannot be caught or ignored, and the receiving process cannot perform any clean-up upon receiving this signal. The following exceptions apply:
Zombie processes cannot be killed since they are already dead and waiting for their parent processes to reap them.
Processes that are in the blocked state will not die until they wake up again.
The init process is special: It does not get signals that it does not want to handle, and thus it can ignore SIGKILL.[7] An exception from this exception is while init is ptraced on Linux.[8][9]
An uninterruptibly sleeping process may not terminate (and free its resources) even when sent SIGKILL. This is one of the few cases in which a UNIX system may have to be rebooted to solve a temporary software problem.
SIGKILL is used as a last resort when terminating processes in most system shutdown procedures if it does not voluntarily exit in response to SIGTERM. To speed the computer shutdown procedure, Mac OS X 10.6, aka Snow Leopard, will send SIGKILL to applications that have marked themselves "clean" resulting in faster shutdown times with, presumably, no ill effects.[10] The command killall -9 has a similar, while dangerous effect, when executed e.g. in Linux; it doesn't let programs save unsaved data. It has other options, and with none, uses the safer SIGTERM signal.
SIGPIPE
The SIGPIPE signal is sent to a process when it attempts to write to a pipe without a process connected to the other end.
SIGPOLL
The SIGPOLL signal is sent when an event occurred on an explicitly watched file descriptor.[11] Using it effectively leads to making asynchronous I/O requests since the kernel will poll the descriptor in place of the caller. It provides an alternative to active polling.
SIGRTMIN to SIGRTMAX
The SIGRTMIN to SIGRTMAX signals are intended to be used for user-defined purposes. They are real-time signals.
SIGQUIT
The SIGQUIT signal is sent to a process by its controlling terminal when the user requests that the process quit and perform a core dump.
SIGSEGV
The SIGSEGV signal is sent to a process when it makes an invalid virtual memory reference, or segmentation fault, i.e. when it performs a segmentation violation.[12]
SIGSTOP
The SIGSTOP signal instructs the operating system to stop a process for later resumption.
SIGSYS
The SIGSYS signal is sent to a process when it passes a bad argument to a system call. In practice, this kind of signal is rarely encountered since applications rely on libraries (e.g. libc) to make the call for them. SIGSYS can be received by applications violating the Linux Seccomp security rules configured to restrict them.
SIGTERM
The SIGTERM signal is sent to a process to request its termination. Unlike the SIGKILL signal, it can be caught and interpreted or ignored by the process. This allows the process to perform nice termination releasing resources and saving state if appropriate. SIGINT is nearly identical to SIGTERM.
SIGTSTP
The SIGTSTP signal is sent to a process by its controlling terminal to request it to stop (terminal stop). It is commonly initiated by the user pressing Ctrl+Z. Unlike SIGSTOP, the process can register a signal handler for, or ignore, the signal.
SIGTTIN and SIGTTOU
The SIGTTIN and SIGTTOU signals are sent to a process when it attempts to read in or write out respectively from the tty while in the background. Typically, these signals are received only by processes under job control; daemons do not have controlling terminals and, therefore, should never receive these signals.
SIGTRAP
The SIGTRAP signal is sent to a process when an exception (or trap) occurs: a condition that a debugger has requested to be informed of – for example, when a particular function is executed, or when a particular variable changes value.
SIGURG
The SIGURG signal is sent to a process when a socket has urgent or out-of-band data available to read.
SIGUSR1 and SIGUSR2
The SIGUSR1 and SIGUSR2 signals are sent to a process to indicate user-defined conditions.
SIGXCPU
The SIGXCPU signal is sent to a process when it has used up the CPU for a duration that exceeds a certain predetermined user-settable value.[13] The arrival of a SIGXCPU signal provides the receiving process a chance to quickly save any intermediate results and to exit gracefully, before it is terminated by the operating system using the SIGKILL signal.
SIGXFSZ
The SIGXFSZ signal is sent to a process when it grows a file that exceeds the maximum allowed size.
SIGWINCH
The SIGWINCH signal is sent to a process when its controlling terminal changes its size (a window change).[14]

As of PHP 7.4, this is now possible by registering a handler callback with the sapi_windows_set_ctrl_handler() function.
This is complemented by sapi_windows_generate_ctrl_event(), which can be used to dispatch signals to other processes attached to the same console as the caller.
Only the CTRL-C and CTRL-BREAK events can be handled in user space. The close/log-off/shutdown events cannot be implemented safely, as the operating system will likely be in an unpredictable state of partial shutdown by the time the handler function is invoked, so there is a risk that any code executed at this point will do more harm than good.
You can find more information about the underlying mechanism on MSDN:
SetConsoleCtrlHandler()
GenerateConsoleCtrlEvent()
The PHP API is almost identical to the underlying C API. The only notable difference is that PHP only permits a single callback to be registered, and consequently the handler does not have a meaningful return value; the engine simply marks the events as handled. This is in order to keep the implementation simple, as a stack of functions can easily be implemented in userland, and likewise if you don't want to handle an event you can simply call exit.

If you want to run a task in PHP via command line that takes a very long time, I would try to organize it in batches and keep track of what is already done.
Now you can completely process each batch (ex: process and then store it in an xml file) and not only after the whole list is processed. So a crash/stop in between will only cancel one batch and not all of them.
If you store your current position after each batch somewhere, you can easily resume when your script crashes or is stopped.
Now if you check the OS process-list to see if your script is running, you can write a cron job that starts your script every X minutes if it had crashed and was not already running.
So, TL;DR
Process job in small batches
Store position of last successfully processed batch
Check for already running process at start
Continually start script until all are happy!
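The first two points can be sketched in a few lines. This is written in bash rather than PHP purely to show the idea; process_in_batches, the one-item-per-line input file and the checkpoint file are assumptions made for the example, and the "already running" check is left out.

```bash
#!/bin/bash
# process_in_batches ITEMS CHECKPOINT BATCH_SIZE COMMAND...  (hypothetical helper)
# Feeds ITEMS (one item per line) to COMMAND on stdin, BATCH_SIZE lines at a
# time, and records the number of finished lines in CHECKPOINT after each batch.
process_in_batches() {
    local items=$1 checkpoint=$2 batch_size=$3
    shift 3
    local done_count=0 total
    if [ -s "$checkpoint" ]; then
        done_count=$(cat "$checkpoint")      # resume where the last run stopped
    fi
    total=$(($(wc -l < "$items")))
    while [ "$done_count" -lt "$total" ]; do
        # Hand the next batch to the command; stop without advancing on failure.
        tail -n "+$((done_count + 1))" "$items" | head -n "$batch_size" | "$@" || return 1
        done_count=$((done_count + batch_size))
        if [ "$done_count" -gt "$total" ]; then done_count=$total; fi
        echo "$done_count" > "$checkpoint"   # position of the last finished batch
    done
}
```

If the script is killed in the middle of a batch, the checkpoint still points at the last completed batch, so the next run (for example from cron) redoes at most one batch.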
That aside, I like PHP for small command line jobs, but if you have such a large task, something else might be better suited. Look for something that can run stably for a long time and has a means of showing its progress. Maybe a small C# app with a minimalistic GUI.
