killing shell process and child oracle process - oracle

From my shell script, I kill a background function process with the kill command. The function calls a PL/SQL procedure through sqlplus:
func_foo(){
retval=`sqlplus -s $USER_NAME/$PWD <<EOF
set pages 0 lines 120 trimout on trimspool on tab off echo off verify off feed off serverout on
exec pkg_xyz.proc_abc();
exit;
EOF`
}
func_foo&
pid_func_foo=$!
sleep 5
kill $pid_func_foo 2>/dev/null
wait $pid_func_foo 2>/dev/null
The problem with this approach is that even after my function process is killed, the Oracle process keeps running. I am new to Oracle and not sure how to handle this scenario. Please give me a hint on how to handle it.

Killing the Oracle processes is a bad idea. Try to solve your problem in another way.
Run your procedure as a job, using dbms_scheduler. You can simply stop the job when needed by calling dbms_scheduler.stop_job('job name').
Build your procedure so it can be stopped programmatically. I have built a couple of procedures that run for a very long time. Every now and then the procedure checks a table called "Status", containing only one row. If the status is "ok", it runs on. If I change the row to something else, the procedure sees this and stops.

Hitting control-c in an interactive SQL*Plus session terminates the running command, generates an informational ORA-01013 message, and leaves you at the SQL*Plus prompt - with the Oracle process still alive but idle (possibly oversimplifying somewhat).
You can get the equivalent effect by sending an interrupt signal, rather than the default termination signal. This might vary slightly depending on your OS and shell, but is usually something like:
kill -INT $pid_func_foo 2>/dev/null
This should still generate the ORA-01013 and the sqlplus process will continue. But as the next statement in your 'here document' is exit it will still stop and will do so more naturally than with a termination signal, and the Oracle session will clear down normally, removing the Oracle process. (If your procedure is doing any inserts or updates, there may still be a delay while the transaction rolls back).
I'm not sure this is a particularly good way to manage execution time limits; job control or resource management might be a better way to go.
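To make the "send a signal, then give the session time to clear down" idea concrete, here is a minimal bash sketch. The helper name signal_and_wait and its arguments are my own assumptions, not something from the question, and it assumes the PID belongs to a process you own.

```shell
#!/bin/bash
# signal_and_wait PID SIGNAL TIMEOUT  (hypothetical helper, not from the post)
# Sends SIGNAL to PID, then polls once per second for up to TIMEOUT seconds.
# Returns 0 once the process is gone, 1 if it is still alive at the end.
signal_and_wait() {
    local pid="$1" sig="$2" timeout="$3"
    kill -s "$sig" "$pid" 2>/dev/null
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # no such process any more
        sleep 1
        timeout=$((timeout - 1))
    done
    ! kill -0 "$pid" 2>/dev/null
}

# Possible use with the script from the question (untested against sqlplus):
#   signal_and_wait "$pid_func_foo" INT 30 || kill "$pid_func_foo"
```

One caveat, as far as I know: in a non-interactive bash script, commands started with & ignore SIGINT, so an INT sent to the PID of the background function may have no visible effect. In that case you may need to signal the sqlplus process itself, or fall back to the default signal.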

Related

DBMS_LOCK.SLEEP vs UNIX sleep

I have a shell script which will trigger a PL/SQL report generation procedure after certain pre-conditions are satisfied. The logic for checking whether the pre-conditions are fulfilled is written in a PL/SQL package. The report generation needs to wait until the pre-conditions are fulfilled.
What are the pros and cons of waiting using dbms_lock.sleep inside PL/SQL procedure vs UNIX sleep?
Like a lot of design decisions the answer is, it depends.
Database connections are expensive and relatively time consuming operations. So probably the more efficient approach would be to connect to the database once and let the PL/SQL job handle the waiting process.
Also it's probably cleaner to have a simple PL/SQL call and let the database handle the report or sleep logic rather than write an API that returns a state which the calling program must interpret and act on. This also gives you a neater path to alternative execution (say by calling from a GUI or a DBMS_SCHEDULER job).
There are two specific advantages of using a shell script sleep:
You have the option of emitting a status every time the loop enters sleep mode (if this is interactive)
Execute on sys.dbms_lock is not granted to anybody by default. Some DBAs can be reluctant to grant execute on that package.
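For the shell-side option, a polling loop could look roughly like the sketch below. The function name poll_until is an assumption of mine, and the check command is a placeholder: in practice it would be a function that calls sqlplus to ask the PL/SQL package whether the pre-conditions hold, which means one database connection per attempt.

```shell
#!/bin/bash
# poll_until CHECK_CMD INTERVAL MAX_TRIES  (hypothetical helper)
# Runs CHECK_CMD until it succeeds, sleeping INTERVAL seconds between tries
# and printing a status line each time it goes back to sleep.
# Returns 0 when the check succeeds, 1 after MAX_TRIES failed attempts.
poll_until() {
    local check="$1" interval="$2" max_tries="$3" try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$check"; then
            echo "condition met on attempt $try"
            return 0
        fi
        echo "attempt $try/$max_tries: not ready, sleeping ${interval}s"
        sleep "$interval"
        try=$((try + 1))
    done
    return 1
}

# Example (preconditions_met is a placeholder you would write yourself):
#   poll_until preconditions_met 60 30 && echo "start the report"
```

The status line on every pass is the first advantage listed above. The cost is the repeated connection, which the dbms_lock.sleep approach avoids.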

SQLCBLLE not running correctly and does not produce MSGW

I have been facing this problem lately :
Normally, when an ILE COBOL program running in a batch job on IBM i (AS/400) triggers an exception, the batch job stops and goes from RUN to MSGW. But when the program is an SQLCBLLE and there is a problem executing an SQL statement, it simply rolls back and continues execution without putting the job into MSGW.
Is there a way to know whether an SQLCBLLE in a batch job has not executed correctly, and is it possible to trigger MSGW for the batch job and let the default error handler deal with it?
Every SQL statement should be followed by a test that checks SQLSTATE (or possibly SQLCODE) to see if the SQL succeeded. Depending on the SQLSTATE (or perhaps SQLCODE) value, the program needs to decide what action to take.
The action can be to send a *INQ message to put the job into MSGW status until a reply is returned.
Without seeing code that causes a problem, it's difficult to say much more. A statement such as exec sql select * from tableA already has a potentially significant problem by not specifying a column list, regardless of the existence of tableA. Embedded SQL generally will not cause an exception to be returned, but will use SQLSTATE to describe problems. It's the developer's responsibility to check for those returned conditions.
There is an interesting discussion that may be helpful here. It's about RPG rather than CBL but may be useful in solving your problem.

How to restart a program in terminal periodically?

I am calling a program, let's say myprogram, from the terminal (in OS X Mavericks) but sometimes it gets stuck due to external problems out of my control. This tends to happen approximately every half an hour.
myprogram basically has to perform a large quantity of small subtasks, which are saved in a file that is read in every new execution, so there is no need to recompute everything from the beginning.
I would like to fully automate restarting the program by killing it and starting it again, in the following way:
Start the program.
Kill it after 30 minutes (the program will be probably stuck).
Restart it (back to step 1).
Any ideas on how to do this? My knowledge of bash scripting is not great precisely...
The following script can serve as a wrapper script for myprogram
#!/bin/bash
while true #begin infinite loop (you'll have to manually kill)
do
./myprogram & #execute myprogram and background
PID=$! #get PID of myprogram
sleep 1800 #sleep 30 minutes (30m might work as parameter)
kill -9 $PID #kill myprogram
done
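A variation on the wrapper, as a sketch only: it stops waiting as soon as the program exits on its own, and it tries SIGTERM before falling back to kill -9. The function name run_with_timeout, the 2 second grace period, and the return code 124 (borrowed from the convention of GNU timeout) are my assumptions.

```shell
#!/bin/bash
# run_with_timeout SECONDS CMD [ARGS...]  (hypothetical helper)
# Runs CMD in the background and checks once per second whether it is alive.
# If it outlives SECONDS, it gets SIGTERM, then SIGKILL 2 seconds later.
# Returns CMD's exit status, or 124 if the time limit was hit.
run_with_timeout() {
    local limit="$1" elapsed=0 pid
    shift
    "$@" &
    pid=$!
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$limit" ]; then
            kill "$pid" 2>/dev/null        # polite request first (SIGTERM)
            sleep 2
            kill -9 "$pid" 2>/dev/null     # forced, in case it is still there
            wait "$pid" 2>/dev/null
            return 124
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    wait "$pid"                            # collect the real exit status
}

# Restart loop equivalent to the wrapper above:
#   while true; do run_with_timeout 1800 ./myprogram; done
```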
You could use a wrapper, but, an infinite loop is not an optimal solution. If you are looking to relaunch a program on timer, or not, depending on the exit code and are on OS X, you should use launchd configuration (xml property list) files and load them with launchctl.
KeepAlive <boolean or dictionary of stuff>
This optional key is used to control whether your job is to be kept continuously running or to let
demand and conditions control the invocation. The default is false and therefore only demand will start
the job. The value may be set to true to unconditionally keep the job alive. Alternatively, a dictionary
of conditions may be specified to selectively control whether launchd keeps a job alive or not. If
multiple keys are provided, launchd ORs them, thus providing maximum flexibility to the job to refine
the logic and stall if necessary. If launchd finds no reason to restart the job, it falls back on
demand based invocation. Jobs that exit quickly and frequently when configured to be kept alive will
be throttled to conserve system resources.
SuccessfulExit <boolean>
If true, the job will be restarted as long as the program exits and with an exit status of zero.
If false, the job will be restarted in the inverse condition. This key implies that "RunAtLoad"
is set to true, since the job needs to run at least once before we can get an exit status.
...
ExitTimeOut <integer>
The amount of time launchd waits before sending a SIGKILL signal. The default value is 20 seconds. The
value zero is interpreted as infinity.
For more information on launchd & plists visit :
https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/CreatingLaunchdJobs.html
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man5/launchd.plist.5.html

Asterisk: don't wait for AGI script (bash) to finish before continuing in dialplan

I have an Asterisk dialplan that executes a bash script that matches the callerID with a database to geolocate the caller (by matching country and area codes). Since the database is quite large (global scale), it takes up to 15 seconds to finish.
I need to run this script immediately after answering the call (in case the user hangs up before the call is finished), but don't want the user to wait for the script execution. The return values should ideally be processed at the end of the dialplan just before the hangup.
Q1: I found http://www.voip-info.org/wiki/view/Asterisk+AGI#Forkandcontinuedialplan which deals with my problem with regard to Perl scripts. How do I accomplish the same in bash? I know I can send any bash script to the background by adding a "&" at the end, but I'm clueless how to do that in the dialplan / when using AGI scripts.
Q2: How can I process the values even if the user hung up before / the dialplan "exited non-zero"?
Thanks for your help!
Use fastagi interface. Or fire UserEvent with AMI listener.
AGI is not designed to work like you want, so it will not work.
Sure, you can use the nohup command to get an immortal bash script, but that is not the way it should be done.
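For completeness, the nohup route mentioned above would look something like this sketch. The helper name run_detached and the script name in the usage comment are made up, and I have not verified how Asterisk reacts to it. The idea is only that the slow job no longer holds the caller's stdin, stdout, or stderr, so the calling script can return right away and the result can be read from a file later.

```shell
#!/bin/bash
# run_detached OUTFILE CMD [ARGS...]  (hypothetical helper)
# Starts CMD in the background under nohup, with stdin taken from /dev/null
# and stdout/stderr sent to OUTFILE, then returns immediately.
# The PID of the background job is available to the caller in $!.
run_detached() {
    local out="$1"
    shift
    nohup "$@" > "$out" 2>&1 < /dev/null &
}

# Example from inside an AGI script (geolocate.sh is a placeholder name):
#   run_detached "/tmp/geo.$$.txt" ./geolocate.sh "$callerid"
```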

In what order should I send signals to gracefully shutdown processes?

In a comment on this answer of another question, the commenter says:
don’t use kill -9 unless absolutely
necessary! SIGKILL can’t be trapped so
the killed program can’t run any
shutdown routines to e.g. erase
temporary files. First try HUP (1),
then INT (2), then QUIT (3)
I agree in principle about SIGKILL, but the rest is news to me. Given that the default signal sent by kill is SIGTERM, I would expect it is the most-commonly expected signal for graceful shutdown of an arbitrary process. Also, I have seen SIGHUP used for non-terminating reasons, such as telling a daemon "re-read your config file." And it seems to me that SIGINT (the same interrupt you'd typically get with Ctrl-C, right?) isn't as widely supported as it ought to be, or terminates rather ungracefully.
Given that SIGKILL is a last resort — Which signals, and in what order, should you send to an arbitrary process, in order to shut it down as gracefully as possible?
Please substantiate your answers with supporting facts (beyond personal preference or opinion) or references, if you can.
Note: I am particularly interested in best practices that include consideration of bash/Cygwin.
Edit: So far, nobody seems to mention INT or QUIT, and there's limited mention of HUP. Is there any reason to include these in an orderly process-killing?
SIGTERM tells an application to terminate. The other signals tell the application other things which are unrelated to shutdown but may sometimes have the same result. Don't use those. If you want an application to shut down, tell it to. Don't give it misleading signals.
Some people believe the smart standard way of terminating a process is by sending it a slew of signals, such as HUP, INT, TERM and finally KILL. This is ridiculous. The right signal for termination is SIGTERM and if SIGTERM doesn't terminate the process instantly, as you might prefer, it's because the application has chosen to handle the signal. Which means it has a very good reason to not terminate immediately: It's got cleanup work to do. If you interrupt that cleanup work with other signals, there's no telling what data from memory it hasn't yet saved to disk, what client applications are left hanging or whether you're interrupting it "mid-sentence" which is effectively data corruption.
For more information on what the real meaning of the signals is, see sigaction(2). Don't confuse "Default Action" with "Description", they are not the same thing.
SIGINT is used to signal an interactive "keyboard interrupt" of the process. Some programs may handle the situation in a special way for the purpose of terminal users.
SIGHUP is used to signal that the terminal has disappeared and is no longer looking at the process. That is all. Some processes choose to shut down in response, generally because their operation makes no sense without a terminal, some choose to do other things such as recheck configuration files.
SIGKILL is used to forcefully remove the process from the kernel. It is special in the sense that it's not actually a signal to the process but rather gets interpreted by the kernel directly.
Don't send SIGKILL. - SIGKILL should certainly never be sent by scripts. If the application handles the SIGTERM, it can take it a second to cleanup, it can take a minute, it can take an hour. Depending on what the application has to get done before it's ready to end. Any logic that "assumes" an application's cleanup sequence has taken long enough and needs to be shortcut or SIGKILLed after X seconds is just plain wrong.
The only reason why an application would need a SIGKILL to terminate, is if something bugged out during its cleanup sequence. In which case you can open a terminal and SIGKILL it manually. Aside from that, the only one other reason why you'd SIGKILL something is because you WANT to prevent it from cleaning itself up.
Even though half the world blindly sends SIGKILL after 5 seconds, it's still a horribly wrong thing to do.
Short Answer: Send SIGTERM, 30 seconds later, SIGKILL. That is, send SIGTERM, wait a bit, then send SIGKILL. The wait may vary from program to program and you may know your system better, but 5 to 30 seconds is enough. When shutting down a machine, you may see it automatically wait up to 1 minute 30 seconds. Why the hurry, after all?
Reasonable Answer: SIGTERM, SIGINT, SIGKILL
This is more than enough. The process will very probably terminate before SIGKILL.
Long Answer: SIGTERM, SIGINT, SIGQUIT, SIGABRT, SIGKILL
This is unnecessary, but at least you are not misleading the process regarding your message. All these signals do mean you want the process to stop what it is doing and exit.
No matter what answer you choose from this explanation, keep that in mind!
If you send a signal that means something else, the process may handle it in very different ways (on one hand). On the other hand, if the process doesn't handle the signal, it doesn't matter what you send after all, the process will quit anyway (when the default action is to terminate, of course).
So, you must think of yourself as a programmer. Would you code a handler function for, let's say, SIGHUP to quit a program that connects with something, or would you loop it to try to connect again? That is the main question here! That is why it is important to just send signals that mean what you intend.
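To illustrate the programmer's side of that question, here is a small bash sketch of a process that treats the two signals differently: SIGHUP triggers a (pretend) reload and the process keeps running, while SIGTERM runs cleanup and exits. The function name worker and its temp-file argument are invented for the example.

```shell
#!/bin/bash
# worker TMPFILE  (hypothetical example process)
# HUP  -> prints a reload message and carries on
# TERM -> removes TMPFILE and exits with status 0
# Meant to be started in the background (worker file &), because the TERM
# handler calls exit and would otherwise end the calling shell.
worker() {
    worker_tmp="$1"
    : > "$worker_tmp"                       # create the file to clean up later
    trap 'echo "HUP: reloading config"' HUP
    trap 'echo "TERM: cleaning up"; rm -f "$worker_tmp"; exit 0' TERM
    while true; do
        sleep 1                             # handlers run after sleep returns
    done
}

# Example:
#   worker /tmp/worker.tmp &
#   kill -HUP $!     # worker keeps running
#   kill -TERM $!    # worker cleans up and exits
```

Sending HUP to this process in the hope of stopping it would do nothing useful, which is the point being made above.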
Almost Stupid Long Answer:
The table below contains the relevant signals, and the default actions in case the program does not handle them.
I ordered them in the order I suggest to use (BTW, I suggest you to use the reasonable answer, not this one here), if you really need to try them all (it would be fun to say the table is ordered in terms of the destruction they may cause, but that is not completely true).
The signals with an asterisk (*) are NOT recommended. The important thing about these is that you may never know what the program is coded to do with them. Especially SIGUSR! It may start the apocalypse (it is a free signal for a programmer to do whatever he/she wants!). But, if not handled OR in the unlikely case it is handled to terminate, the program will terminate.
In the table, the signals with default options to terminate and generate a core dump are left in the end, just before SIGKILL.
Signal      Value  Action  Comment
----------------------------------------------------------------------
SIGTERM     15     Term    Termination signal
SIGINT       2     Term    Famous CONTROL+C interrupt from keyboard
SIGHUP       1     Term    Disconnected terminal or parent died
SIGPIPE     13     Term    Broken pipe
SIGALRM(*)  14     Term    Timer signal from alarm
SIGUSR2(*)  12     Term    User-defined signal 2
SIGUSR1(*)  10     Term    User-defined signal 1
SIGQUIT      3     Core    CONTROL+\ or quit from keyboard
SIGABRT      6     Core    Abort signal from abort(3)
SIGSEGV     11     Core    Invalid memory reference
SIGILL       4     Core    Illegal Instruction
SIGFPE       8     Core    Floating point exception
SIGKILL      9     Term    Kill signal
Then I would suggest for this almost stupid long answer:
SIGTERM, SIGINT, SIGHUP, SIGPIPE, SIGQUIT, SIGABRT, SIGKILL
And finally, the
Definitely Stupid Long Long Answer:
Don't try this at home.
SIGTERM, SIGINT, SIGHUP, SIGPIPE, SIGALRM, SIGUSR2, SIGUSR1, SIGQUIT, SIGABRT, SIGSEGV, SIGILL, SIGFPE and if nothing worked, SIGKILL.
SIGUSR2 should be tried before SIGUSR1 because we are better off if the program doesn't handle the signal. And it is much more likely for it to handle SIGUSR1 if it handles just one of them.
BTW, about KILL: it is not wrong to send SIGKILL to a process, as another answer stated. Think about what happens when you run a shutdown command. It will try SIGTERM and SIGKILL only. Why do you think that is the case? And why do you need any other signals, if the very shutdown command uses only these two?
Now, back to the long answer, this is a nice oneliner:
for SIG in 15 2 3 6 9 ; do echo $SIG ; echo kill -$SIG $PID || break ; sleep 30 ; done
It sleeps for 30 seconds between signals. Why else would you need a oneliner? ;)
Also, recommended: try it with only signals 15 2 9 from the reasonable answer.
Safety: remove the second echo when you are ready to go. I call it my dry run for one-liners. Always use it to test.
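If you want the same idea without always sleeping the full 30 seconds, a function version might look like the sketch below. The name escalate and the DRY_RUN variable are my own inventions; the signal list is whatever you pass in. It stops as soon as the process is gone and keeps the dry-run habit from the one-liner.

```shell
#!/bin/bash
# escalate PID DELAY SIGNAL...  (hypothetical helper)
# Sends each SIGNAL in order, waiting up to DELAY seconds after each one,
# and stops early once PID no longer exists.
# With DRY_RUN set to a non-empty value, it only prints what it would send.
# Returns 0 if the process is gone at the end, 1 otherwise.
escalate() {
    local pid="$1" delay="$2" sig waited
    shift 2
    for sig in "$@"; do
        kill -0 "$pid" 2>/dev/null || return 0      # already gone
        echo "sending $sig to $pid"
        if [ -z "${DRY_RUN:-}" ]; then
            kill -s "$sig" "$pid" 2>/dev/null
        fi
        waited=0
        while [ "$waited" -lt "$delay" ] && kill -0 "$pid" 2>/dev/null; do
            sleep 1
            waited=$((waited + 1))
        done
    done
    ! kill -0 "$pid" 2>/dev/null
}

# Reasonable answer:  escalate "$PID" 30 TERM INT KILL
# Dry run:            DRY_RUN=1 escalate "$PID" 1 TERM INT KILL
```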
Script killgracefully
Actually I was so intrigued by this question that I decided to create a small script to do just that. Please, feel free to download (clone) it here:
GitHub link to Killgracefully repository
Typically you'd send SIGTERM, the default of kill. It's the default for a reason. Only if a program does not shut down in a reasonable amount of time should you resort to SIGKILL. But note that with SIGKILL the program has no possibility to clean things up and data could be corrupted.
As for SIGHUP, HUP stands for "hang up" and historically meant that the modem disconnected. It's essentially equivalent to SIGTERM. The reason that daemons sometimes use SIGHUP to restart or reload config is that daemons detach from any controlling terminals as a daemon doesn't need those and therefore would never receive SIGHUP, so that signal was considered as "freed up" for general use. Not all daemons use this for reload! The default action for SIGHUP is to terminate and many daemons behave that way! So you can't go blindly sending SIGHUPs to daemons and expecting them to survive.
Edit: SIGINT is probably inappropriate to terminate a process, as it's normally tied to ^C or whatever the terminal setting is to interrupt a program. Many programs capture this for their own purposes, so it's common enough for it not to work. SIGQUIT typically has the default of creating a core dump, and unless you want core files laying around it's not a good candidate, either.
Summary: if you send SIGTERM and the program doesn't die within your timeframe then send it SIGKILL.
SIGTERM actually means sending an application a message: "would you be so kind and commit suicide". It can be trapped and handled by application to run cleanup and shutdown code.
SIGKILL cannot be trapped by application. Application gets killed by OS without any chance for cleanup.
It's typical to send SIGTERM first, sleep some time, then send SIGKILL.
SIGTERM is equivalent to "clicking the 'X' " in a window.
SIGTERM is what Linux uses first, when it is shutting down.
With all the discussion going on here, no code has been offered. Here's my take:
#!/bin/bash
pid=1234  # PID of the process to stop (example value; no spaces around = and no $ when assigning)
echo "Killing process $pid..."
kill $pid
waitAttempts=30
for i in $(seq 1 $waitAttempts)
do
echo "Checking if process is alive (attempt #$i / $waitAttempts)..."
sleep 1
if ps -p $pid > /dev/null
then
echo "Process $pid is still running"
else
echo "Process $pid has shut down successfully"
break
fi
done
if ps -p $pid > /dev/null
then
echo "Could not shut down process $pid gracefully - killing it forcibly..."
kill -SIGKILL $pid
fi
HUP sounds like rubbish to me. I'd send it to get a daemon to re-read its configuration.
SIGTERM can be intercepted; your daemons just might have clean-up code to run when it receives that signal. You cannot do that for SIGKILL. Thus with SIGKILL you are not giving the daemon's author any options.
More on that on Wikipedia
