I have a script which starts many processes in background and uses nohup to make sure these processes keeps on running -
nohup "./$__service_script1.pl" $__service_args < /dev/null > /var/log/$__service_name.log 2>&1 &
The problem is its important for me to make sure that processes starts in the order of their invocation. Is their a way to wait until the process has definitely started before attempting to start another process?
I tried wait, but it waits till the process is finished, I just want to make sure that process has started. Probably the simplest solution would be to put sleep for a few seconds in between processes, is there a better solution ?
Thanks
It depends on what you mean by "definitely started". If you mean that fork(2) has completed and the new process exists, then each process is started by the time nohup returns. A new process has been created.
The problem you are running into is that there is no guarantee how long the nohup'ed process gets to run before the shell returns. When the process you start is "definitely started" depends on what the process does for initialization. If you do not have source of the applications or are not able to modify them for some other reason, you will be limited to looking at their output. Many daemons will output a log message at various stages of their initialization. You can modify your script to
Look for a log file, and create an empty one if it does not exist
Open the log file for reading (at the end to avoid false messages from previous invocations), watching for the log message that indicates the process has started,
Start your process with nohup,
Wait for your log file watcher
In bash, it would look something like this might work (this code is completely untested):
log=<path to log file>
msg=<message service prints when it is ready>
svc=<path to service>
# Create log file if it does not exist
if [ ! -f "$log" ] ; then
echo > "$log"
fi
# watch for message to appear on a single line in the log file
tail -0 -f "$log" | egrep "$msg" | head -1 &
ready_pid=$!
# Start the service
nohup "$svc" < /dev/null >> "$log" 2>&1 &
# Wait for the message
wait $ready_pid
You want to start watching the log file before forking the service, because otherwise, the message might go by in the log before the script starting the service can attach to the log file.
Related
I sometimes launch long running tasks on my server and want the server to do something after those tasks finish (usually shut down). If there was only one task, I could simply type the next command into the window running the task, then bash will run it after the current one finishes. But what if there was multiple processes that I want to wait on?
In my workflow, the different tasks are running in different panes on tmux, so I cannot directly use wait since the processes I want to wait for are not child processes in one particular pane.
I have included a possible approach as an answer below.
This question's answer offers a related solution:
tail --pid=$pid -f /dev/null
However, that particular answer can only handle waiting for one process, but we can extend it using wait for multiple processes, then run our own command on completion:
tail --pid=$pid1 -f /dev/null &
tail --pid=$pid2 -f /dev/null &
tail --pid=$pid3 -f /dev/null &
tail --pid=$pid4 -f /dev/null &
wait; <your-command-here>
Ive got a script that takes a quite a long time to run, as it has to handle many thousands of files. I want to make this script as fool proof as possible. To this end, I want to check if the user ran the script using nohup and '&'. E.x.
me#myHost:/home/me/bin $ nohup doAlotOfStuff.sh &. I want to make 100% sure the script was run with nohup and '&', because its a very painful recovery process if the script dies in the middle for whatever reason.
How can I check those two key paramaters inside the script itself? and if they are missing, how can I stop the script before it gets any farther, and complain to the user that they ran the script wrong? Better yet, is there way I can force the script to run in nohup &?
Edit: the server enviornment is AIX 7.1
The ps utility can get the process state. The process state code will contain the character + when running in foreground. Absence of + means code is running in background.
However, it will be hard to tell whether the background script was invoked using nohup. It's also almost impossible to rely on the presence of nohup.out as output can be redirected by user elsewhere at will.
There are 2 ways to accomplish what you want to do. Either bail out and warn the user or automatically restart the script in background.
#!/bin/bash
local mypid=$$
if [[ $(ps -o stat= -p $mypid) =~ "+" ]]; then
echo Running in foreground.
exec nohup $0 "$#" &
exit
fi
# the rest of the script
...
In this code, if the process has a state code +, it will print a warning then restart the process in background. If the process was started in the background, it will just proceed to the rest of the code.
If you prefer to bailout and just warn the user, you can remove the exec line. Note that the exit is not needed after exec. I left it there just in case you choose to remove the exec line.
One good way to find if a script is logging to nohup, is to first check that the nohup.out exists, and then to echo to it and ensure that you can read it there. For example:
echo "complextag"
if ( $(cat nohup.out | grep "complextag" ) != "complextag" );then
# various commands complaining to the user, then exiting
fi
This works because if the script's stdout is going to nohup.out, where they should be going (or whatever out file you specified), then when you echo that phrase, it should be appended to the file nohup.out. If it doesn't appear there, then the script was nut run using nohup and you can scold them, perhaps by using a wall command on a temporary broadcast file. (if you want me to elaborate on that I can).
As for being run in the background, if it's not running you should know by checking nohup.
I have a bash script with a loop that calls a hard calculation routine every iteration. I use the results from every calculation as input to the next. I need make bash stop the script reading until every calculation is finished.
for i in $(cat calculation-list.txt)
do
./calculation
(other commands)
done
I know the sleep program, and i used to use it, but now the time of the calculations varies greatly.
Thanks for any help you can give.
P.s>
The "./calculation" is another program, and a subprocess is opened. Then the script passes instantly to next step, but I get an error in the calculation because the last is not finished yet.
If your calculation daemon will work with a precreated empty logfile, then the inotify-tools package might serve:
touch $logfile
inotifywait -qqe close $logfile & ipid=$!
./calculation
wait $ipid
(edit: stripped a stray semicolon)
if it closes the file just once.
If it's doing an open/write/close loop, perhaps you can mod the daemon process to wrap some other filesystem event around the execution? `
#!/bin/sh
# Uglier, but handles logfile being closed multiple times before exit:
# Have the ./calculation start this shell script, perhaps by substituting
# this for the program it's starting
trap 'echo >closed-on-calculation-exit' 0 1 2 3 15
./real-calculation-daemon-program
Well, guys, I've solved my problem with a different approach. When the calculation is finished a logfile is created. I wrote then a simple until loop with a sleep command. Although this is very ugly, it works for me and it's enough.
for i in $(cat calculation-list.txt)
do
(calculations routine)
until [[ -f $logfile ]]; do
sleep 60
done
(other commands)
done
Easy. Get the process ID (PID) via some awk magic and then use wait too wait for that PID to end. Here are the details on wait from the advanced Bash scripting guide:
Suspend script execution until all jobs running in background have
terminated, or until the job number or process ID specified as an
option terminates. Returns the exit status of waited-for command.
You may use the wait command to prevent a script from exiting before a
background job finishes executing (this would create a dreaded orphan
process).
And using it within your code should work like this:
for i in $(cat calculation-list.txt)
do
./calculation >/dev/null 2>&1 & CALCULATION_PID=(`jobs -l | awk '{print $2}'`);
wait ${CALCULATION_PID}
(other commands)
done
I have a task that is very well inside of a bash for loop. The situation is though, that a few of the iterations seem to not terminate. What I'm looking for is a way to introduce a timeout that if that iteration of command hasn't terminated after e.g. two hours it will terminate, and move on to the next iteration.
Rough outline:
for somecondition; do
while time-run(command) < 2h do
continue command
done
done
One (tedious) way is to start the process in the background, then start another background process that attempts to kill the first one after a fixed timeout.
timeout=7200 # two hours, in seconds
for somecondition; do
command & command_pid=$!
( sleep $timeout & wait; kill $command_pid 2>/dev/null) & sleep_pid=$!
wait $command_pid
kill $sleep_pid 2>/dev/null # If command completes prior to the timeout
done
The wait command blocks until the original command completes, whether naturally or because it was killed after the sleep completes. The wait immediately after sleep is used in case the user tries to interrupt the process, since sleep ignores most signals, but wait is interruptible.
If I'm understanding your requirement properly, you have a process that needs to run, but you want to make sure that if it gets stuck it moves on, right? I don't know if this will fully help you out, but here is something I wrote a while back to do something similar (I've since improved this a bit, but I only have access to a gist at present, I'll update with the better version later).
#!/bin/bash
######################################################
# Program: logGen.sh
# Date Created: 22 Aug 2012
# Description: parses logs in real time into daily error files
# Date Updated: N/A
# Developer: #DarrellFX
######################################################
#Prefix for pid file
pidPrefix="logGen"
#output direcory
outDir="/opt/Redacted/logs/allerrors"
#Simple function to see if running on primary
checkPrime ()
{
if /sbin/ifconfig eth0:0|/bin/grep -wq inet;then isPrime=1;else isPrime=0;fi
}
#function to kill previous instances of this script
killScript ()
{
/usr/bin/find /var/run -name "${pidPrefix}.*.pid" |while read pidFile;do
if [[ "${pidFile}" != "/var/run/${pidPrefix}.${$}.pid" ]];then
/bin/kill -- -$(/bin/cat ${pidFile})
/bin/rm ${pidFile}
fi
done
}
#Check to see if primary
#If so, kill any previous instance and start log parsing
#If not, just kill leftover running processes
checkPrime
if [[ "${isPrime}" -eq 1 ]];then
echo "$$" > /var/run/${pidPrefix}.$$.pid
killScript
commands && commands && commands #Where the actual command to run goes.
else
killScript
exit 0
fi
I then set this script to run on cron every hour. Every time the script is run, it
creates a lock file named after a variable that describes the script that contains the pid of that instance of the script
calls the function killScript which:
uses the find command to find all lock files for that version of the script (this lets more than one of these scripts be set to run in cron at once, for different tasks). For each file it finds, it kills the processes of that lock file and removes the lock file (it automatically checks that it's not killing itself)
Starts doing whatever it is I need to run and not get stuck (I've omitted that as it's hideous bash string manipulation that I've since redone in python).
If this doesn't get you squared let me know.
A few notes:
the checkPrime function is poorly done, and should either return a status, or just exit the script itself
there are better ways to create lock files and be safe about it, but this has worked for me thus far (famous last words)
I have a previously running process (process1.sh) that is running in the background with a PID of 1111 (or some other arbitrary number). How could I send something like command option1 option2 to that process with a PID of 1111?
I don't want to start a new process1.sh!
Named Pipes are your friend. See the article Linux Journal: Using Named Pipes (FIFOs) with Bash.
Based on the answers:
Writing to stdin of background process
Accessing bash command line args $# vs $*
Why my named pipe input command line just hangs when it is called?
Can I redirect output to a log file and background a process at the same time?
I wrote two shell scripts to communicate with my game server.
This first script is run when computer start up. It does start the server and configure it to read/receive my commands while it run in background:
start_czero_server.sh
#!/bin/sh
# Go to the game server application folder where the game application `hlds_run` is
cd /home/user/Half-Life
# Set up a pipe named `/tmp/srv-input`
rm /tmp/srv-input
mkfifo /tmp/srv-input
# To avoid your server to receive a EOF. At least one process must have
# the fifo opened in writing so your server does not receive a EOF.
cat > /tmp/srv-input &
# The PID of this command is saved in the /tmp/srv-input-cat-pid file
# for latter kill.
#
# To send a EOF to your server, you need to kill the `cat > /tmp/srv-input` process
# which PID has been saved in the `/tmp/srv-input-cat-pid file`.
echo $! > /tmp/srv-input-cat-pid
# Start the server reading from the pipe named `/tmp/srv-input`
# And also output all its console to the file `/home/user/Half-Life/my_logs.txt`
#
# Replace the `./hlds_run -console -game czero +port 27015` by your application command
./hlds_run -console -game czero +port 27015 > my_logs.txt 2>&1 < /tmp/srv-input &
# Successful execution
exit 0
This second script it just a wrapper which allow me easily to send commands to the my server:
send.sh
half_life_folder="/home/jack/Steam/steamapps/common/Half-Life"
half_life_pid_tail_file_name=hlds_logs_tail_pid.txt
half_life_pid_tail="$(cat $half_life_folder/$half_life_pid_tail_file_name)"
if ps -p $half_life_pid_tail > /dev/null
then
echo "$half_life_pid_tail is running"
else
echo "Starting the tailing..."
tail -2f $half_life_folder/my_logs.txt &
echo $! > $half_life_folder/$half_life_pid_tail_file_name
fi
echo "$#" > /tmp/srv-input
sleep 1
exit 0
Now every time I want to send a command to my server I just do on the terminal:
./send.sh mp_timelimit 30
This script allows me to keep tailing the process on your current terminal, because every time I send a command, it checks whether there is a tail process running in background. If not, it just start one and every time the process sends outputs, I can see it on the terminal I used to send the command, just like for the applications you run appending the & operator.
You could always keep another open terminal open just to listen to my server server console. To do it just use the tail command with the -f flag to follow my server console output:
./tail -f /home/user/Half-Life/my_logs.txt
If you don't want to be limited to signals, your program must support one of the Inter Process Communication methods. See the corresponding Wikipedia article.
A simple method is to make it listen for commands on a Unix domain socket.
For how to send commands to a server via a named pipe (fifo) from the shell see here:
Redirecting input of application (java) but still allowing stdin in BASH
How do I use exec 3>myfifo in a script, and not have echo foo>&3 close the pipe?
You can use the bash's coproc comamnd. (avaliable only in 4.0+) - it's like ksh's |&
check this for examples http://wiki.bash-hackers.org/syntax/keywords/coproc
you can't send new args to a running process.
But if you are implementing this process or its a process that can take the args from a pipe, then the other answer would help.