how to timeout a linux script - bash

The coreutils timeout utility and the other timeout scripts I've found apply to a single CMD,
but I'd like to apply a timeout to a whole Linux script, killing it if it hasn't finished within a period. Something like:
cd XXX && CMD && sleep 3 && kill -0 XX
How do I do that?

You can pass a subshell to timeout and have the subshell run the code that needs the time limit:
#!/bin/bash
timeout 5 bash -c "ping google.com -c 2; ping yahoo.com -c 10"
If you clarify what you need exactly there may be cleaner ways to achieve this.
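timeout's exit status also tells you whether the deadline was hit: GNU timeout exits with 124 when it had to kill the command, and otherwise passes through the command's own exit status. A quick local check:

```shell
# exit status 124 means timeout had to kill the command;
# otherwise you get the command's own exit status
timeout 2 bash -c 'sleep 5; echo done'   # killed: prints nothing
echo "exit: $?"                          # prints: exit: 124

timeout 5 bash -c 'sleep 1; echo done'   # prints: done
echo "exit: $?"                          # prints: exit: 0
```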

Related

How to make Bash background command exit when parent exits?

I would like the python3 command to exit when its parent exits (after 10 seconds).
#! /bin/bash
sudo bash -c "python3 screen-buttons.py &"
sleep 10
printf "%s\n" "Done"
Well, you have to know the pid you want to kill, one way or another.
#!/bin/bash
# assuming screen-buttons.py does not output anything to stdout
pid=$(sudo bash -c 'python3 screen-buttons.py & echo $!')
sleep 10
kill "$pid"
If screen-buttons.py outputs something to stdout, save the pid to a file and read it from parent.
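A minimal sketch of that pidfile approach, with sleep 30 standing in for the sudo/python3 job and /tmp/sb.pid as an assumed pidfile path:

```shell
#!/bin/bash
# record the child's pid in a file, since stdout is already in use
bash -c 'sleep 30 & echo $! > /tmp/sb.pid'
pid=$(cat /tmp/sb.pid)
sleep 10
kill "$pid" 2>/dev/null
```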
But it looks like you want to implement a timeout. If so, see the timeout utility.
To kill all descendant processes, see https://unix.stackexchange.com/questions/124127/kill-all-descendant-processes
And anyway, why call bash and background the job in a child process? Just call what you want to call:
#!/bin/bash
sudo python3 screen-buttons.py &
pid=$!
sleep 10
sudo kill "$pid"
or, simpler still:
sudo timeout 10 python3 screen-buttons.py

Using GNU timeout with SSH -t in a bash script to prevent hanging

I have a script that ssh's to some servers. Sometimes an unexpected problem causes ssh to hang indefinitely. I want to avoid this by killing ssh if it runs too long.
I'm also using a wrapper function for input redirection. I need to force a tty with the -t flag to make a process on the server happy.
function _redirect {
    if [ "$DEBUG" -eq 0 ]; then
        $* 1> /dev/null 2>&1
    else
        $*
    fi
    return $?
    exit
}
SSH_CMD="ssh -t -o BatchMode=yes -l robot"
SERVER="192.168.1.2"
ssh_script=$(cat <<EOF
sudo flock -w 60 -n /path/to/lock -c /path/to/some_golang_binary
EOF
)
_redirect timeout 1m $SSH_CMD $SERVER "($ssh_script)"
The result is a timeout with this message printed:
tcsetattr: Interrupted system call
The expected result is either the output of the remote shell command, or a timeout and proper exit code.
When I type
timeout 1m ssh -t -o BatchMode=yes -l robot 192.168.1.2 \
    "(sudo flock -w 60 -n /path/to/lock -c /path/to/some_golang_binary)" \
    1> /dev/null
I get the expected result.
I suspect one of two things:
1) The interaction between GNU timeout and ssh causes the tcsetattr system call to take a very long time (or hang); timeout then sends a SIGTERM to interrupt it, and that message is printed. There is no other output because this call is one of the first things done. I wonder if timeout launches ssh in a child process that cannot have a terminal, then uses its main process to count time and kill its child.
I looked here for the reasons this call can fail.
2) _redirect needs a different one of $@, $*, "$@", "$*", etc. Some bad escaping/parameter munging breaks the arguments to timeout, which causes this tcsetattr error. Trying various combinations of these has not yet solved the problem.
What fixed this was adding the --foreground flag to timeout.
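With that flag, the invocation from the script becomes _redirect timeout --foreground 1m $SSH_CMD $SERVER "($ssh_script)". The effect can be seen locally: --foreground stops timeout from moving the command into a separate background process group (which is what detaches it from the tty), while still enforcing the deadline:

```shell
# --foreground leaves the command in timeout's own process group, so
# tty-using programs like `ssh -t` can still control the terminal;
# the time limit is still enforced
timeout --foreground 2 sleep 5
echo "exit: $?"    # prints: exit: 124
```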

Execute a timed function in bash

I am trying to implement a timed function. If the timer expires, the function/command should be killed. If the function/command finishes in time, bash should not wait for the timer to expire.
(cmdpid=$BASHPID; \
( sleep 60; kill $cmdpid 2>/dev/null) & \
child_pid=$!; \
ssh remote_host /users/jj/test.sh; \
kill -9 $child_pid)
The test.sh may or may not finish within 60 seconds. This worked fine.
But when I want to capture the result of test.sh, which echoes "SUCCESS" or "FAILURE", I tried:
result=$(cmdpid=$BASHPID; \
( sleep 60; kill $cmdpid 2>/dev/null) & \
child_pid=$!; \
ssh remote_host /users/jj/test.sh; \
kill -9 $child_pid)
Here it waits for the timer to exit. Using set -x, I can see that the kill -9 $child_pid is executed, but the kill is not really ending the sub-shell.
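A likely explanation for the hang: $( … ) returns not when the subshell exits but when the pipe it reads from reaches EOF, and the backgrounded ( sleep 60; kill … ) timer, along with its sleep child, inherits the write end of that pipe; killing the timer subshell still leaves sleep holding the pipe open. Redirecting the timer's output away releases the substitution as soon as the main command finishes. A minimal runnable sketch, with sleep 1; echo SUCCESS standing in for the ssh call:

```shell
result=$(
    cmdpid=$BASHPID
    # redirect the timer away from the pipe so it can't hold it open
    ( sleep 60; kill "$cmdpid" 2>/dev/null ) >/dev/null 2>&1 &
    timerpid=$!
    sleep 1; echo SUCCESS        # stand-in for: ssh remote_host /users/jj/test.sh
    kill -9 "$timerpid" 2>/dev/null
)
echo "$result"    # prints: SUCCESS (after ~1 second, not 60)
```

The orphaned sleep 60 lingers until it expires, but it no longer blocks the command substitution.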
One way to tackle this problem would be to run the timer in a separate script, say MyTimerTest, which is called from (say) MainScriptTest but runs separately; then whichever script finishes first "kills" the other. For example:
On MainScriptTest you could put this at the beginning:
nohup /folder/MyTimerTest > /dev/null 2>&1 &
On MainScriptTest you could put this at the very end:
killall MyTimerTest > /dev/null 2>&1
The MyTimerTest could be something like this:
#!/bin/bash
sleep 60
killall MainScriptTest > /dev/null 2>&1
exit 0
Note: the long mixed-case names for the scripts (ex.: MainScriptTest) are on purpose; killall is case-sensitive, and that helps prevent it from killing something it should not. To be very safe, you might even add a token in addition to the longer name, like MainScriptTest88888 or something like that.
Edit: Thanks to gilez, who suggested the use of the timeout command. If that is available to you on your system, one could do a quick one-liner like this:
timeout 60 bash -c "/folder/MainScriptTest"
Using timeout is convenient. However, if MainScriptTest creates independent child processes (for example by calling: nohup /folder/OtherScript &) then timeout would not kill those child processes, and the exit would not be clean.
The first solution I gave is longer, but it could be customized to kill those child processes (or any other processes you want) by adding them to the MainScriptTest, like for example:
killall OtherScript > /dev/null 2>&1
Found another way:
result=$( ssh $remote_host /users/jj/test.sh ) & mypid=$!
( sleep 10; kill -9 $mypid ) &
wait $mypid
Note, though, that the assignment itself runs in the backgrounded subshell, so $result will not be visible in the parent shell; write the output to a temporary file instead if you need it there.

Terminal Application to Keep Web Server Process Alive

Is there an app that can, given a command and options, execute for the lifetime of the process and ping a given URL indefinitely on a specific interval?
If not, could this be done on the terminal as a bash script? I'm almost positive it's doable through terminal, but am not fluent enough to whip it up within a few minutes.
Found this post that has a portion of the solution, minus the ping bits. On Linux, ping runs indefinitely until it is actively killed. How would I kill it from bash after, say, two pings?
General Script
As others have suggested, use this in pseudo code:
execute command and save PID
while PID is active, ping and sleep
exit
This results in the following script:
#!/bin/bash
# execute command, use '&' at the end to run in background
<command here> &
# store pid
pid=$!
# loop while the process is still alive
while ps -p "$pid" > /dev/null; do
    ping <address here>
    sleep <timeout here in seconds>
done
Note that the stuff inside <> should be replaced with actual values, be it a command or an IP address.
Break from Loop
To answer your second question: that depends on the loop. In the loop above, simply track the loop count using a variable. To do that, add ((count++)) inside the loop, and then: [[ $count -eq 2 ]] && break. Now the loop will break on the second ping.
Something like this:
...
while ...; do
...
((count++))
[[ $count -eq 2 ]] && break
done
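Putting those pieces together, a runnable sketch of the counted loop (with sleep 30 as a stand-in command and localhost as the address, both assumptions for illustration):

```shell
#!/bin/bash
sleep 30 &                          # stand-in for the real command
pid=$!
count=0
while kill -0 "$pid" 2>/dev/null; do
    ping -c 1 127.0.0.1 > /dev/null 2>&1
    count=$((count+1))              # track how many pings we've sent
    [ "$count" -eq 2 ] && break     # stop after the second ping
    sleep 1
done
kill "$pid" 2>/dev/null
echo "pinged $count times"          # prints: pinged 2 times
```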
ping twice
To ping only a few times, use the -c option:
ping -c <count here> <address here>
Example:
ping -c 2 www.google.com
Use man ping for more information.
Better practice
As hek2mgl noted in a comment below, the current solution may not suffice to solve the problem: while it answers the question, the core problem will still persist. To address it, a cron job is suggested, in which a simple wget or curl HTTP request is sent periodically. This results in a fairly simple script containing just one line:
#!/bin/bash
curl <address here> > /dev/null 2>&1
This script can be added as a cron job. Leave a comment if you'd like more information on how to set up such a scheduled job. Special thanks to hek2mgl for analyzing the problem and suggesting a sound solution.
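Assuming that one-liner is saved as /usr/local/bin/keepalive.sh (a hypothetical path) and made executable, a crontab entry running it every 5 minutes would look like:

```shell
*/5 * * * * /usr/local/bin/keepalive.sh
```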
Say you want to start a download with wget and, while it is running, ping the URL:
wget http://example.com/large_file.tgz &   # put in background
pid=$!
while kill -s 0 $pid 2>/dev/null           # test if process is running
do
    ping -c 1 127.0.0.1                    # ping your address once
    sleep 5                                # and sleep for 5 seconds
done
A nice little generic utility for this is Daemonize. Its relevant options:
Usage: daemonize [OPTIONS] path [arg] ...
-c <dir> # Set daemon's working directory to <dir>.
-E var=value # Pass environment setting to daemon. May appear multiple times.
-p <pidfile> # Save PID to <pidfile>.
-u <user> # Run daemon as user <user>. Requires invocation as root.
-l <lockfile> # Single-instance checking using lockfile <lockfile>.
Here's an example of starting/killing in use: flickd
To get more sophisticated, you could turn your ping script into a systemd service, now standard on many recent Linuxes.
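A minimal sketch of such a unit, assuming the ping loop above is saved as /usr/local/bin/keepalive.sh (both the path and the unit name pinger.service are assumptions):

```ini
# /etc/systemd/system/pinger.service
[Unit]
Description=Keep a web server process alive and ping it

[Service]
ExecStart=/usr/local/bin/keepalive.sh
Restart=always

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable --now pinger.service.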

nohup doesn't work when used with double-ampersand (&&) instead of semicolon (;)

I have a script that uses ssh to login to a remote machine, cd to a particular directory, and then start a daemon. The original script looks like this:
ssh server "cd /tmp/path ; nohup java server 0</dev/null 1>server_stdout 2>server_stderr &"
This script appears to work fine. However, it is not robust to the case when the user enters the wrong path so the cd fails. Because of the ;, this command will try to run the nohup command even if the cd fails.
The obvious fix doesn't work:
ssh server "cd /tmp/path && nohup java server 0</dev/null 1>server_stdout 2>server_stderr &"
that is, the SSH command does not return until the server is stopped. Putting nohup in front of the cd instead of in front of the java didn't work.
Can anyone help me fix this? Can you explain why this solution doesn't work? Thanks!
Edit: cbuckley suggests using sh -c, from which I derived:
ssh server "nohup sh -c 'cd /tmp/path && java server 0</dev/null 1>master_stdout 2>master_stderr' 2>/dev/null 1>/dev/null &"
However, now the exit code is always 0 when the cd fails; whereas if I do ssh server cd /failed/path then I get a real exit code. Suggestions?
See Bash's Operator Precedence.
The & is being attached to the whole statement because it has a higher precedence than &&. You don't need ssh to verify this. Just run this in your shell:
$ sleep 100 && echo yay &
[1] 19934
If the & were only attached to the echo yay, then your shell would sleep for 100 seconds and then report the background job. However, the entire sleep 100 && echo yay is backgrounded and you're given the job notification immediately. Running jobs will show it hanging out:
$ sleep 100 && echo yay &
[1] 20124
$ jobs
[1]+ Running sleep 100 && echo yay &
You can use parentheses to create a subshell around echo yay &, giving you what you'd expect:
sleep 100 && ( echo yay & )
This would be similar to using bash -c to run echo yay &:
sleep 100 && bash -c "echo yay &"
Tossing these into ssh, we get:
# using parentheses...
$ ssh localhost "cd / && (nohup sleep 100 >/dev/null </dev/null &)"
$ ps -ef | grep sleep
me 20136 1 0 16:48 ? 00:00:00 sleep 100
# and using `bash -c`
$ ssh localhost "cd / && bash -c 'nohup sleep 100 >/dev/null </dev/null &'"
$ ps -ef | grep sleep
me 20145 1 0 16:48 ? 00:00:00 sleep 100
Applying this to your command, we get:
ssh server "cd /tmp/path && (nohup java server 0</dev/null 1>server_stdout 2>server_stderr &)"
or:
ssh server "cd /tmp/path && bash -c 'nohup java server 0</dev/null 1>server_stdout 2>server_stderr &'"
Also, with regard to your comment on the post,
Right, sh -c always returns 0. E.g., sh -c exit 1 has error code 0.
this is incorrect. Directly from the manpage:
Bash's exit status is the exit status of the last command executed in
the script. If no commands are executed, the exit status is 0.
Indeed:
$ bash -c "true ; exit 1"
$ echo $?
1
$ bash -c "false ; exit 22"
$ echo $?
22
Another option is to test the directory first in a separate ssh call:
ssh server "test -d /tmp/path" && ssh server "nohup ... &"
Answer roundup:
Bad: Using sh -c to wrap the entire nohup command doesn't work for my purposes because it doesn't return error codes. (@cbuckley)
Okay: ssh <server> <cmd1> && ssh <server> <cmd2> works but is much slower. (@joachim-nilsson)
Good: Create a shell script on <server> that runs the commands in succession and returns the correct error code.
The last is what I ended up using. I'd still be interested in learning why the original use-case doesn't work, if someone who understands shell internals can explain it to me!
