I'm having a bit of difficulty developing bash-based deployment scripts for a pipeline I want to run on an OpenStack VM. There are 4 scripts in total:
head_node.sh - launches the vm and attaches appropriate disk storage to the VM. Once that's completed, it runs the scripts (2 and 3) sequentially by passing a command through ssh to the VM.
install.sh - VM-side, installs all of the appropriate software needed by the pipeline.
run.sh - VM-side, mounts storage on the VM and downloads raw data from object storage. It then runs the final script, but does so by detaching the process from the shell created by ssh using nohup ./pipeline.sh &. The reason I want to detach from the shell is that the next portion is largely just compute and may take days to finish. Therefore, the user shouldn't have to keep the shell open that long and it should just run in the background.
pipeline.sh - VM-side, essentially a for loop that iterates through a list of files and sequentially runs commands on those files and on the intermediate files they produce. The results are analysed and then staged back to the object storage. The VM then essentially tells the head node to kill it.
Now I'm running into a problem with nohup. If I launch the pipeline.sh script normally (i.e. without nohup) and keep it attached to that shell, everything runs smoothly. However, if I detach the script, it errors out after the first command in the first iteration of the for loop. Am I thinking about this the wrong way? What's the correct way to do this?
So this is how it looks:
head_node.sh:
... launched VM etc
ssh $vm_ip './install.sh'
ssh $vm_ip './run.sh'
exit 0
install.sh - omitted - not important for the problem
run.sh:
... mounts storage downloads appropriate files
nohup ./pipeline.sh > log &
exit 0
pipeline.sh:
for f in $(find . -name '*ext')
do
    process1 $f
    process2 $f
done
... stage files to object storage, unmount disks, additional cleanups
ssh $head_node 'nova delete $vm_hash'
exit 0

Since I'm invoking run.sh from an ssh session, subprocesses launched from that script (namely pipeline.sh) do not properly detach from the shell and error out when the ssh session that invoked run.sh terminates. pipeline.sh can be detached properly by calling it from the head node instead, e.g., nohup ssh $vm_ip './pipeline.sh' &. This keeps the ssh session alive until the pipeline finishes.


Can't terminate node(js) process without terminating ssh server in docker container

I'm using a Dockerfile that ends with a CMD ["/start.sh"]:
service ssh start
/usr/bin/node /myApp/app.js
If for some reason I need to kill the node process, the ssh server is closed as well (which forces me to reboot the container to reconnect).
Any simple way to avoid this behavior?
Thank You.
The container exits as soon as its main process exits. In your case, the main process inside the container is the start.sh shell script. start.sh starts the ssh service and then runs the nodejs process as a child process in the foreground. Once the nodejs process dies, the shell script exits as well, and so the container exits. What you can do is put the nodejs process in the background.
service ssh start
/usr/bin/node /myApp/app.js &
# Need the following infinite loop as the shell script should not exit
while true; do
    sleep 2
done
I DO NOT recommend this approach though. You should have only a single process per container. Read the following answers to understand why -
Running multiple applications in one docker container
If you still want to run multiple processes inside a container, there are better ways to do it, such as using supervisord - https://docs.docker.com/config/containers/multi-service_container/

jobs command result is empty when process is run through script

I need to run rsync in the background from a shell script, but once it has started, I need to monitor the status of those jobs from the shell.
The jobs command returns nothing when it's run in the shell after the script exits. ps -ef | grep rsync shows that rsync is still running.
I can check the status from within the script, but I need to run the script multiple times so that each run uses a different ip.txt file to push. So I can't keep the script running just to check the jobs status.
Here is the script:
for i in `cat $ip.txt`; do
    rsync -avzh $directory/ user@"$i":/cygdrive/c/test/$directory 2>&1 > /dev/null &
done
jobs; #shows the jobs status while in the shell script.
exit 1
Output of jobs command is empty after the shell script exits:
root@host001:~# jobs
What could be the reason and how could I get the status of jobs while the rsync is running in background? I can't find an article online related to this.
Since your shell (the one from which you execute jobs) did not start rsync, it doesn't know anything about it. There are different approaches to fixing that, but it boils down to starting the background process from your shell. For example, you can start the script you have using the source BASH command instead of executing it in a separate process. Of course, you'd have to remove the exit 1 at the end, because that exits your shell otherwise.
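The difference between executing and sourcing can be seen with a small stand-in script. This is only a sketch: push.sh and the temporary directory are hypothetical names, and a short sleep takes the place of rsync. It uses `.`, which is the portable spelling of the bash source command.

```shell
#!/bin/sh
# Hypothetical stand-in for the rsync script: it starts one background job.
demo_dir=$(mktemp -d)
cat > "$demo_dir/push.sh" <<'EOF'
sleep 3 >/dev/null 2>&1 </dev/null &
EOF

# 1. Executed as a separate process: the job belongs to that child shell,
#    so the calling shell's job table stays empty.
sh "$demo_dir/push.sh"
jobs > "$demo_dir/after_exec.txt"

# 2. Sourced: the background job is started by the current shell itself,
#    so `jobs` can report it.
. "$demo_dir/push.sh"
jobs > "$demo_dir/after_source.txt"

echo "jobs after executing:"
cat "$demo_dir/after_exec.txt"
echo "jobs after sourcing:"
cat "$demo_dir/after_source.txt"

# Stop the job started by the sourced copy.
kill "$!" 2>/dev/null
```

The first listing should come out empty and the second should show the sleep job; the exact format of the jobs line differs between shells.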

Automatically terminate all nodes after calling roslaunch

I am trying to run several roslaunch files, one after the other, from a bash script. However, when the nodes complete execution, they hang with the message:
[grem_node-1] process has finished cleanly
log file: /home/user/.ros/log/956b5e54-75f5-11e9-94f8-a08cfdc04927/grem_node-1*.log
Then I need to Ctrl-C to get killing on exit for all of the nodes launched from the launch file. Is there some way of causing nodes to automatically kill themselves on exit? Because at the moment I need to Ctrl-C every time a node terminates.
My bash script looks like this, by the way:
python /home/user/git/segmentation_plots/scripts/generate_grem_launch.py /home/user/Data2/Coco 0 /home/user/git/Async_CNN/config.txt
source ~/setupgremsim.sh
roslaunch grem_ros grem.launch config:=/home/user/git/Async_CNN/config.txt
source /home/user/catkin_ws/devel/setup.bash
roslaunch rpg_async_cnn_generator conf_coco.launch
The script setupgremsim.sh sources another catkin workspace.
Many thanks!
Thanks all for your advice. What I ended up doing was this: I launched my ROS nodes from separate python scripts, which I then called from the bash script. In python you are able to terminate child processes with shutdown. So to provide an example for anyone else with this issue:
bash script:
for i in {0..100}
do
    echo "========================================================\n"
    echo "This is the $i th run\n"
    echo "========================================================\n"
    source /home/timo/catkin_ws/devel/setup.bash
    python planar_launch_generator.py
done
and then inside planar_launch_generator.py:
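import time

import roslaunch
import rospy

process_generate_running = True

class ProcessListener(roslaunch.pmon.ProcessListener):
    def process_died(self, name, exit_code):
        # Tell the main loop to stop once a launched process exits.
        global process_generate_running
        process_generate_running = False
        rospy.logwarn("%s died with code %s", name, exit_code)

def init_launch(launchfile, process_listener):
    uuid = roslaunch.rlutil.get_or_generate_uuid(None, False)
    # The argument list was cut off in the post; these arguments are
    # reconstructed from the function's parameters.
    launch = roslaunch.parent.ROSLaunchParent(
        uuid, [launchfile], process_listeners=[process_listener])
    return launch

launch_file = "/home/user/catkin_ws/src/async_cnn_generator/launch/conf_coco.launch"
launch = init_launch(launch_file, ProcessListener())
launch.start()
# The loop body was also missing; polling and then calling shutdown is assumed.
while process_generate_running:
    time.sleep(0.05)
launch.shutdown()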
Using this method you could source any number of different catkin workspaces and launch any number of launchfiles.
Try this:
(1) Put each launch in a separate shell script, so that you have N scripts.
In each script, call the launch file in xterm: xterm -e "roslaunch yourfancylauncher"
(2) Prepare a master script that calls all N child scripts in the sequence, and with the delays, that you want.
Once a launch is done, its xterm should kill itself.
Edit: You can manually kill one if you know it's going to hang, e.g. below:
source /opt/ros/kinetic/setup.bash
source ~/catkin_ws/devel/setup.bash
# Start roscore using systemd or rc.local, in lxterminal or another terminal, to avoid killing it by accident. Then run the part that you think is going to hang or cause a problem. Echo->action if necessary.
xterm -geometry 80x36+0+0 -e "echo 'uav' | sudo -S dnsmasq -C /dev/null -kd -F, -i enp59s0 --bind-dynamic" & sleep 15
# The Ouster lidar can't auto-configure like the Velodyne and will hang here, so the code after it can't run.
killall xterm & sleep 1
# Let's just kill it and continue running the other launches.
xterm -e "roslaunch '/home/uav/catkin_ws/src/ouster_driver_1.12.0/ouster_ros/os1.launch' os1_hostname:=os1-991907000715.local os1_udp_dest:="

run script at termination of process

I'm computing on an AWS EC2 type environment. I have a script that runs on the headnode, then executes a shell command on a VM, which it detaches from the headnode parent shell.
nohup ssh -i vm-key ubuntu@vm-ip './vm_script.sh' &
exit 0
The reason I do it like this is that vm_script.sh takes a very long time to finish. So I want to detach it from the headnode shell and have it run in the background of the VM.
ssh -i head-node-key user@head-node-ip './delete-vm.sh'
exit 0
Once the VM completes the task it sends a command back to the headnode to delete the VM.
Is there a more reliable way to execute the final script on the headnode following completion of the script on the VM?
E.g., After I launch the vm_script.sh, could I launch another script on the headnode that waits for the completion of the PID on the VM?
Assuming you trust the network link, the simplest approach is also the most reliable one. I would rewrite headnode.sh as shown here:
ssh -i vm-key ubuntu@vm-ip './vm_script.sh' && ./delete-vm.sh
This script waits for vm_script.sh to terminate and then invokes ./delete-vm.sh, but only if vm_script.sh returned no error code.
If you don't want to be stuck in the shell waiting for the termination of headnode.sh, just call it using nohup ./headnode.sh &
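The wait-then-cleanup pattern above could be wrapped in a small helper so that a failed run is reported instead of silently skipped. This is a rough sketch, not a drop-in replacement: run_then_cleanup is a hypothetical name, and both commands are passed as strings so that harmless stand-ins can be substituted while trying it out.

```shell
#!/bin/sh
# run_then_cleanup JOB_CMD CLEANUP_CMD   (hypothetical helper)
# Runs JOB_CMD, waits for it to finish, and runs CLEANUP_CMD only when
# JOB_CMD exits with status 0. On failure the job's status is returned
# and the cleanup is skipped, mirroring `job && cleanup`.
run_then_cleanup() {
    job=$1
    cleanup=$2
    if sh -c "$job"; then
        sh -c "$cleanup"
    else
        status=$?
        echo "job failed with status $status; skipping cleanup" >&2
        return "$status"
    fi
}

# Intended use on the headnode (not executed here):
#   run_then_cleanup "ssh -i vm-key ubuntu@vm-ip './vm_script.sh'" "./delete-vm.sh"
```

Skipping the cleanup on failure keeps the VM around for inspection, which is the same behaviour as the && form; whether that is desirable depends on how expensive an idle VM is.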

How to run shell script on VM indefinitely?

I have a VM that I want running indefinitely. The server is always running but I want the script to keep running after I log out. How would I go about doing so? Creating a cron job?
In general the following steps are sufficient to convince most Unix shells that the process you're launching should not depend on the continued existence of the shell:
run the command under nohup
run the command in the background
redirect all file descriptors that normally point to the terminal to other locations
So, if you want to run command-name, you should do it like so:
nohup command-name >/dev/null 2>/dev/null </dev/null &
This tells the process that will execute command-name to send all stdout and stderr to nowhere (instead of to your terminal) and also to read stdin from nowhere (instead of from your terminal). Of course if you actually have locations to write to/read from, you can certainly use those instead -- anything except the terminal is fine:
nohup command-name >outputFile 2>errorFile <inputFile &
See also the answer in Petur's comment, which discusses this issue a fair bit.
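The three steps listed earlier (nohup, background, redirect every terminal file descriptor) can be bundled into one function. This is a minimal sketch under the assumption that a single log file for both stdout and stderr is acceptable; detach is a hypothetical name, not a standard command.

```shell
#!/bin/sh
# detach LOGFILE COMMAND [ARGS...]   (hypothetical helper)
# Starts COMMAND under nohup, in the background, with stdin taken from
# /dev/null and stdout/stderr sent to LOGFILE, so nothing still points at
# the terminal. Prints the PID of the background process.
detach() {
    log=$1
    shift
    nohup "$@" >"$log" 2>&1 </dev/null &
    echo "$!"
}

# Intended use (not executed here):
#   pid=$(detach command.log command-name)
```

Because the background process no longer holds the caller's stdout, capturing the PID with $(...) returns immediately instead of waiting for the command to finish.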
