I'd like to kill a rosbag instance gracefully via terminal.
Gracefully means, in this case, that the rosbag file doesn't keep the .active suffix after the kill.
So I do the following from the terminal to send the recommended SIGINT to rosbag:
$ rosbag record /some/topic &
$ RPID=$!
$ # do some stuff
$ kill -2 $RPID
Unfortunately, the bag remains .active, and it can happen that not everything was stored to disk.
However, if I put rosbag into a launch file, it seems to work:
$ roslaunch rosbag_record.launch &
$ LPID=$!
$ # do some stuff
$ kill -2 $LPID
Now the bag stays intact and is stored without the .active suffix.
Now the interesting question is: what am I doing wrong in the first case?
I thought that killing a launch file (and, in this case, the roscore) raises a ros::shutdown(), which causes a SIGINT in all processes.
But the manual way of using kill seems to behave differently.
Native signal handling is not well supported, and it is always better to use ROS's intended ways of starting and shutting down nodes, so that the API can keep track.
To end a node gracefully, assume a rosbag node named my_bag was started:
rosbag record -o /file/name /topic __name:=my_bag
Then the node can be gracefully killed using the rosnode kill command and the name of the node:
rosnode kill /my_bag
Link for reference
As mentioned in the reference in #Tik0's answer, you can add a trap in bash to catch SIGINT (Ctrl+C) and call rosnode kill from there, e.g.:
#!/bin/bash
trap "rosnode kill /bagger" SIGINT
rosbag record /my_topic __name:=bagger &
roslaunch mypackage launch.launch
I wanted to start rosbag recordings from within a launchfile with
<node pkg="rosbag" type="record" name="rosbag_recorder" args="--all --exclude '/unwanted/(.*)'"/>
and had the issue of .active-files when stopping roslaunch with CTRL+C.
A solution which works well for me is to start the following node in the launchfile, which triggers rosnode kill upon user request by typing "stop":
#!/usr/bin/env python
import rospy
import subprocess

NODENAME = 'keypress_killer'
TAG = NODENAME + ": "

if __name__ == '__main__':
    rospy.init_node(NODENAME)
    while not rospy.is_shutdown():
        try:
            rospy.logwarn(TAG + "Please type quoted \"stop\" to gracefully kill all nodes and the rosbag recording")
            input_str = input()
            if input_str == "stop":
                rospy.logwarn(TAG + "Stopping all nodes and rosbag recording...")
                subprocess.check_call(['rosnode', 'kill', '-a'])  # gracefully kill the rosbag recorder
            else:
                raise ValueError
        except:
            rospy.logerr(TAG + "Unknown command, please type quoted \"stop\" to gracefully kill all nodes and the rosbag recording")
        rospy.sleep(0.1)
Start the rosbag node with a special node name so it can be shut down easily, and optionally give the bag file a name of your choice.
-a means record all topics; you can list specific topic names instead if you want:
rosbag record --output-name=bagFileName -a __name:=bagNodeName
Now you can see a node named 'bagNodeName' if you run rosnode list, and also a bagFileName.bag.active file in the current folder.
Then, to shut down the node and save the bag file automatically:
rosnode kill bagNodeName
Now you should see the bagFileName.bag file.
Related
I am trying to run several roslaunch files, one after the other, from a bash script. However, when the nodes complete execution, they hang with the message:
[grem_node-1] process has finished cleanly
log file: /home/user/.ros/log/956b5e54-75f5-11e9-94f8-a08cfdc04927/grem_node-1*.log
Then I need to press Ctrl-C to get "killing on exit" for all of the nodes launched from the launch file. Is there some way of causing nodes to automatically kill themselves on exit? Because at the moment I need to Ctrl-C every time a node terminates.
My bash script looks like this, by the way:
python /home/user/git/segmentation_plots/scripts/generate_grem_launch.py /home/user/Data2/Coco 0 /home/user/git/Async_CNN/config.txt
source ~/setupgremsim.sh
roslaunch grem_ros grem.launch config:=/home/user/git/Async_CNN/config.txt
source /home/user/catkin_ws/devel/setup.bash
roslaunch rpg_async_cnn_generator conf_coco.launch
The script setupgremsim.sh sources another catkin workspace.
Many thanks!
Thanks all for your advice. What I ended up doing was this: I launched my ROS nodes from separate Python scripts, which I then called from the bash script. In Python you are able to terminate child processes with launch.shutdown(). So, to provide an example for anyone else with this issue:
bash script:
#!/bin/bash
for i in {0..100}
do
    echo "========================================================\n"
    echo "This is the $i th run\n"
    echo "========================================================\n"
    source /home/timo/catkin_ws/devel/setup.bash
    python planar_launch_generator.py
done
and then inside planar_launch_generator.py:
import roslaunch
import rospy

process_generate_running = True

class ProcessListener(roslaunch.pmon.ProcessListener):
    def process_died(self, name, exit_code):
        global process_generate_running
        process_generate_running = False
        rospy.logwarn("%s died with code %s", name, exit_code)

def init_launch(launchfile, process_listener):
    uuid = roslaunch.rlutil.get_or_generate_uuid(None, False)
    roslaunch.configure_logging(uuid)
    launch = roslaunch.parent.ROSLaunchParent(
        uuid,
        [launchfile],
        process_listeners=[process_listener],
    )
    return launch

rospy.init_node("async_cnn_generator")
launch_file = "/home/user/catkin_ws/src/async_cnn_generator/launch/conf_coco.launch"
launch = init_launch(launch_file, ProcessListener())
launch.start()

while process_generate_running:
    rospy.sleep(0.05)

launch.shutdown()
Using this method you could source any number of different catkin workspaces and launch any number of launchfiles.
Try this:
(1) Put each launch in a separate shell script, so you have N scripts.
In each script, call the launch file in xterm: xterm -e "roslaunch yourfancylauncher"
(2) Prepare a master script which calls all N child scripts in the sequence you want, with the delays you want, as in the sketch below.
Once a launch is done, its xterm should kill itself.
Edit: You can manually kill one if you know it's going to hang, e.g. below:
#!/bin/sh
source /opt/ros/kinetic/setup.bash
source ~/catkin_ws/devel/setup.bash
Start roscore using systemd or rc.local, and use lxterminal or other terminals to avoid killing it by accident. Then run the part which you think is going to hang or create a problem, echoing before each action if necessary:
xterm -geometry 80x36+0+0 -e "echo 'uav' | sudo -S dnsmasq -C /dev/null -kd -F 10.5.5.50,10.5.5.100 -i enp59s0 --bind-dynamic" & sleep 15
The Ouster lidar can't auto-configure like the Velodyne and will hang here, so the rest of the script can't run.
killall xterm & sleep 1
Let's just kill it and continue running the other launches:
xterm -e "roslaunch '/home/uav/catkin_ws/src/ouster_driver_1.12.0/ouster_ros/os1.launch' os1_hostname:=os1-991907000715.local os1_udp_dest:=10.5.5.1"
I have a program I want to start. Let's say this program runs a while(true) loop (so it does not terminate). I want to write a bash script which:
Starts the program (./endlessloop &)
Waits 1 second (sleep 1)
Kills the program --> How?
I cannot use $! to get the PID of the child because the server is running a lot of instances concurrently.
Store the PID:
./endlessloop & endlessloop_pid=$!
sleep 1
kill "$endlessloop_pid"
You can also check whether the process is still running with kill -0:
if kill -0 "$endlessloop_pid"; then
echo "Endlessloop is still running"
fi
...and storing the content in a variable means it scales to multiple processes:
endlessloop_pids=( ) # initialize an empty array to store PIDs
./endlessloop & endlessloop_pids+=( "$!" ) # start one in background and store its PID
./endlessloop & endlessloop_pids+=( "$!" ) # start another and store its PID also
kill "${endlessloop_pids[@]}" # kill both endlessloop instances started above
See also BashFAQ #68, "How do I run a command, and have it abort (timeout) after N seconds?"
The ProcessManagement page on the Wooledge wiki also discusses relevant best practices.
You can use the pgrep command for the same purpose:
kill $(pgrep endlessloop)
I have a bash script called run.sh that launches multiple processes:
#!/bin/bash
proc1 &
proc2 &
proc3 &
final # this runs until sigterm
When I execute run.sh and I send a SIGTERM to run.sh, I don't think SIGTERM is being sent to final, and I don't think it is being sent to proc1, proc2, and proc3. Note that in this use case this is a docker container which runs run.sh, and running docker stop is the way I'm trying to send SIGTERM.
What would be the easiest way for the bash script to send a SIGTERM to all of the processes it started? The only way I can think of is starting final with & too and then doing a while loop in run.sh?
EDIT - I've tried it though, doesn't seem to work:
In run.sh
#!/bin/bash
_term() {
    echo "Caught SIGTERM signal!"
}
trap _term SIGTERM
echo "hi"
sleep 100000 &
wait $!
When running docker stop, I never see Caught SIGTERM signal!
You said you run that script in a Docker container. Could you give us more details on how you start the container and how run.sh is invoked?
When docker stop is invoked, or a direct SIGTERM is received by the container, the contained process with PID 1 will receive it. When your run.sh creates child processes that run in the background, it also has to forward signals to them.
Therefore it is not a good approach to create background child processes in a bash script with &. Using a supervisor would be good practice, as it handles signals properly and forwards them to its child processes without any further scripting needed.
In addition, the supervisord should not be started as a shell child process itself. That would happen if you specified this as your container command in your Dockerfile:
CMD /usr/bin/supervisord
Instead it should look like:
CMD ["/usr/bin/supervisord"]
That way the supervisor becomes the root process with PID 1 and will receive all the signals properly and redirects them to its child processes.
Use jobs -p to get the process ids of any background jobs, then pass them to kill.
trap 'kill $(jobs -p)' TERM
proc1 &
proc2 &
proc3 &
wait
Correct, I would collect them all in an array and then send a signal to each one of them when finished. I would use something like awk '{ system("kill -15 " $1)}'.
/bin/sh -version
GNU sh, version 1.14.7(1)
exitfn () {
    # Restore signal handling for SIGINT
    echo "exiting with trap" >> /tmp/logfile
    rm -f /var/run/lockfile.pid    # Growl at user,
    exit                           # then exit script.
}
trap 'exitfn; exit' SIGINT SIGQUIT SIGTERM SIGKILL SIGHUP
The above is my function in shell script.
I want to call it in some special conditions...like
when:
"kill -9" fires on pid of this script
"ctrl + z" press while it is running on -x mode
server reboots while script is executing ..
In short, any kind of interrupt to the script should trigger some action,
e.g. rm -f /var/run/lockfile.pid,
but my function above is not working properly; it works only for terminal close or "Ctrl+C".
Kindly don't suggest to upgrade "bash / sh" version.
SIGKILL cannot be trapped by the trap command, or by any process. It is a guaranteed kill signal that by its definition cannot be trapped, so upgrading your sh/bash will not help anyway.
You can't trap kill -9; that's the whole point of it: to violently destroy processes that don't respond to other signals (there's a workaround for this, see below).
The server reboot should first deliver a signal to your script which should be caught with what you have.
As to Ctrl+Z, that also gives you a signal, SIGTSTP, so you may want to add that. Though that wouldn't normally be a reason to shut down your process, since it may then be put into the background and restarted (with bg).
As to what to do for those situations where your process dies without a catchable signal (like the -9 case), the program should check for that on startup.
By that, I mean lockfile.pid should store the actual PID of the process that created it (by using echo $$ >/var/run/myprog_lockfile.pid for example) and, if you try to start your program, it should check for the existence of that process.
If the process doesn't exist, or it exists but isn't the right one (based on name usually), your new process should delete the pidfile and carry on as if it was never there. If the old process both exists and is the right one, your new process should log a message and exit.
I would like to do the following:
I want to link a process A to a file F, so:
If F disappears, A crashes.
F will only disappear when A finishes.
Is this possible? Thank you very much.
You should not avoid PIDs. They are process identifiers, and meant to be used.
Bash automatically monitors child processes it starts. The most recent background process ID is maintained in $!. Bash also supports job control using the '%n' syntax.
You can trap child process status changes with trap SIGCHLD, and you can "wait" for one or all child processes to complete with the wait command.
Here is a rough approximation of your two-process monitoring, which consists of "job1" and "job2" being started by the sample script:
job1 & # start job1 in background
j1pid=$! # get its process id
job2 & # start job2 in background
j2pid=$! # get its process id
trap 'err=1' ERR # trap all errors
err=
wait $j1pid # wait for job1 to complete
# at this point job1 could have completed normally,
# or either process could have had an error
trap - ERR # revert to "normal" handling of most errors
# kill the processes nicely, or abruptly
# kill -TERM sends the TERM signal to the process, which it can trap
# and do whatever pre-exit process is needed.
# kill -9 cannot be trapped.
for pid in $j1pid $j2pid ; do
    kill -TERM $pid 2>/dev/null || kill -9 $pid
done
You already have a file with almost this property on Linux. If you create a process, /proc/<pid> will exist while the process is alive. As an example, if your process number is 1050, /proc/1050 will exist until the process dies. I do not know if removing this file will kill the process, but you can try to tie both together.