when reading docker events, how do I quit docker events instead of the entire program? - bash

The requirement: inside a big while loop, given a specific docker container ID, I need to monitor for that container's stop event using the docker events command. When I find the event, I need to quit reading docker events.
The simplified bash script I wrote is as follows:
#!/bin/bash
while true
do
    ## other logic
    docker events --filter='container=...' --filter='event=stop' | while read event
    do
        echo "$event"
        break
    done
    echo "got here"
    ## other logic
done
So in one bash session I would run this script, and in another bash session I would type the following command:
docker container stop cassandra-1
The problem is that when I execute this script, I can capture the stop event and print it, but the command keeps reading events repeatedly. How do I quit docker events and print "got here"?
I've spent a lot of time searching for a solution online, but can't find a good approach. I considered using break, but it doesn't work, and I also considered kill -9 $$, but that quits the entire script. I only need to quit docker events, not the big while loop.
Any good advice? Thanks so much!

I found the solution. The key point is to understand how the docker events command actually works. Based on my research, docker events starts a process that stays in the foreground; you are expected to press CTRL+C to quit it (refer to: https://docs.docker.com/engine/reference/commandline/events). That is why break cannot get out of it: break only exits the inner read loop, while docker events keeps running and holds the pipe open. The approach is to run docker events as a background job and kill it when the event is found.
The code is as follows:
(docker events --filter "container=$container_id" --filter 'event=stop' &) | while read event
do
    # kill the background docker events process
    pkill -f "docker events.*stop"
done
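An alternative that avoids pkill's command-line matching (which could hit an unrelated process) is to remember the PID of the exact stream you started and kill only that. A minimal sketch, assuming bash 4+ for coproc, with the container name from the question as an example:
#!/bin/bash
container_id="cassandra-1"   # example value

while true
do
    ## other logic

    # start the event stream as a coprocess; bash records its PID in EVENTS_PID
    coproc EVENTS { docker events --filter "container=$container_id" --filter 'event=stop'; }

    # block until the first stop event arrives on the coprocess pipe
    read -r event <&"${EVENTS[0]}"
    echo "$event"

    # kill exactly the stream we started, not every matching process
    kill "$EVENTS_PID" && wait "$EVENTS_PID" 2>/dev/null

    echo "got here"
    ## other logic
done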

Related

call a script automatically in the container before docker stops the container

I want a custom bash script in the container that is called automatically before the container stops (docker stop or ctrl + c).
According to this docker doc and multiple StackOverflow threads, I need to catch the SIGTERM signal in the container and then run my custom script when the signal arrives. As far as I know, SIGTERM is delivered only to the root process with PID 1.
Relevant part of my Dockerfile:
...
COPY container-scripts/entrypoint.sh /
ENTRYPOINT ["/entrypoint.sh"]
I use [] (the exec form) to define the entrypoint; as I understand it, this runs my script directly, without a /bin/sh -c wrapper taking PID 1, and when the script eventually execs another process, that process becomes the main process and receives the docker stop signal.
entrypoint.sh:
...
# run the external bash script if it exists
BOOT_SCRIPT="/boot.sh"
if [ -f "$BOOT_SCRIPT" ]; then
    printf ">> executing the '%s' script\n" "$BOOT_SCRIPT"
    source "$BOOT_SCRIPT"
fi
# start something here
...
The boot.sh is used by child containers to execute whatever else the child container wants. Everything is fine; my containers work like a charm.
ps axu in a child container:
PID USER TIME COMMAND
1 root 0:00 {entrypoint.sh} /bin/bash /entrypoint.sh
134 root 0:25 /usr/lib/jvm/java-17-openjdk/bin/java -server -D...
...
421 root 0:00 ps axu
Before stopping the container I need to run some commands automatically, so I created a shutdown.sh bash script. This script works fine and does what I need, but for now I execute it manually this way:
$ docker exec -it my-container /bin/bash
# /shutdown.sh
# exit
$ docker container stop my-container
I would like to automate the execution of the shutdown.sh script.
I tried to add the following to the entrypoint.sh but it does not work:
trap "echo 'hello SIGTERM'; source /shutdown.sh; exit" SIGTERM
What is wrong with my code?
Your help and comments guided me in the right direction. I went through the official documentation again (here, here, and here) and finally found the problem.
The issue was the following: my entrypoint.sh script, which kept the container alive, executed this command at the end:
# start the ssh server
ssh-keygen -A
/usr/sbin/sshd -D -e "$@"
The -D option runs the ssh daemon in non-detached mode, so sshd does not become a daemon. That was actually my intention; this is how I kept the container alive. But this foreground process prevented the trap command from executing properly: bash runs a trapped signal handler only after the current foreground command has finished, so while sshd -D was running the handler never got a chance. I changed the way I start sshd, and now it runs as a normal background process.
Then I added the following command to keep my docker container alive (this is a recommended best practice):
tail -f /dev/null
But of course the same issue appeared: tail runs as a foreground process, and the trap command does not do its job.
The only way I can keep the container alive and let entrypoint.sh run as a foreground process in docker is the following:
while true; do
    sleep 1
done
This way the trap command works fine, and my bash function that handles SIGINT etc. runs properly when the time comes.
But honestly, I do not like this solution. This endless loop with a sleep looks ugly, and at the moment I have no idea how to manage it in a nicer way :(
But that is another question that does not belong to this thread (though it would be great if you can suggest a better solution).
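For what it's worth, the usual way to keep the trap responsive without a sleep loop is to start the long-lived process in the background and block on the wait builtin, which, unlike a foreground command, returns as soon as a signal arrives. A minimal sketch, not the exact setup above, with sshd standing in for the long-lived process:
#!/bin/bash
shutdown_handler() {
    echo 'hello SIGTERM'
    source /shutdown.sh
    exit 0
}
trap shutdown_handler SIGTERM

# start the main process in the background instead of the foreground
ssh-keygen -A
/usr/sbin/sshd -D -e &

# wait is interruptible: the trap fires as soon as SIGTERM arrives
wait $!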

Start systemctl from a Bash script and don't wait for it

I need to call systemctl start myservice near the end of a Bash script, but I really don't care whether it succeeds or when it returns; I just need to start the action. It's others' task to monitor the status of that service. My script should return as quickly as possible, whether or not that service has completed starting; I'm not depending on that.
My first thought was to use something like this:
# do other work
systemctl start myservice &
echo "done"
# end of script
But I've read that this is problematic with signals or in non-interactive environments, where my script is usually called. So I read on and found the nohup command, but that writes nohup.out files wherever it runs and, they say, might hang unless you redirect stdin from /dev/null.
So I still don't know how to do this correctly. I'm open to a generic way to start-and-forget any process from a Bash script, or to a systemctl-specific answer, as this will be my only use case for now.
I found a pretty easy solution to this:
systemctl start --no-block myservice
The --no-block option can be used for starting, stopping, etc., and it won't wait for the actual operation to finish. More details are in the systemctl man page.
If you simply want to launch systemctl and not wait for it, use exec to replace the current shell process with the systemctl call. For example, instead of backgrounding the process, simply use:
exec systemctl ....
You may want to include the --no-pager option to ensure that the output isn't piped to a pager, which would block waiting for user input, e.g.
exec systemctl --no-pager ....
Of course your echo "done" will never be reached, but that wasn't pertinent to your script.
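For the generic start-and-forget case the question also asks about (any process, not just systemctl), one common pattern is to detach the child from the script's session and silence its standard streams; this avoids both the signal issues of a bare & and nohup's output files. A minimal sketch, assuming setsid from util-linux is available:
#!/bin/bash
# do other work

# run in a new session, with the standard streams detached, so the child
# is not tied to our terminal and the script can exit immediately
setsid systemctl start myservice </dev/null >/dev/null 2>&1 &

echo "done"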

Automatically terminate all nodes after calling roslaunch

I am trying to run several roslaunch files, one after the other, from a bash script. However, when the nodes complete execution, they hang with the message:
[grem_node-1] process has finished cleanly
log file: /home/user/.ros/log/956b5e54-75f5-11e9-94f8-a08cfdc04927/grem_node-1*.log
Then I need to press Ctrl-C to get "killing on exit" for all of the nodes launched from the launch file. Is there some way of making the nodes kill themselves automatically on exit? Because at the moment I need to press Ctrl-C every time a node terminates.
My bash script looks like this, by the way:
python /home/user/git/segmentation_plots/scripts/generate_grem_launch.py /home/user/Data2/Coco 0 /home/user/git/Async_CNN/config.txt
source ~/setupgremsim.sh
roslaunch grem_ros grem.launch config:=/home/user/git/Async_CNN/config.txt
source /home/user/catkin_ws/devel/setup.bash
roslaunch rpg_async_cnn_generator conf_coco.launch
The script setupgremsim.sh sources another catkin workspace.
Many thanks!
Thanks all for your advice. What I ended up doing was this: I launched my ROS nodes from separate Python scripts, which I then called from the bash script. In Python you can terminate the launched processes with shutdown(). To provide an example for anyone else with this issue:
bash script:
#!/bin/bash
for i in {0..100}
do
echo "========================================================\n"
echo "This is the $i th run\n"
echo "========================================================\n"
source /home/timo/catkin_ws/devel/setup.bash
python planar_launch_generator.py
done
and then inside planar_launch_generator.py:
import roslaunch
import rospy

process_generate_running = True

class ProcessListener(roslaunch.pmon.ProcessListener):
    global process_generate_running

    def process_died(self, name, exit_code):
        global process_generate_running
        process_generate_running = False
        rospy.logwarn("%s died with code %s", name, exit_code)

def init_launch(launchfile, process_listener):
    uuid = roslaunch.rlutil.get_or_generate_uuid(None, False)
    roslaunch.configure_logging(uuid)
    launch = roslaunch.parent.ROSLaunchParent(
        uuid,
        [launchfile],
        process_listeners=[process_listener],
    )
    return launch

rospy.init_node("async_cnn_generator")
launch_file = "/home/user/catkin_ws/src/async_cnn_generator/launch/conf_coco.launch"
launch = init_launch(launch_file, ProcessListener())
launch.start()

while process_generate_running:
    rospy.sleep(0.05)

launch.shutdown()
Using this method you can source any number of different catkin workspaces and launch any number of launch files.
Try this:
(1) Put each launch in a separate shell script, so you have N scripts. In each script, call the launch file in xterm: xterm -e "roslaunch yourfancylauncher"
(2) Prepare a master script that calls all N child scripts in the sequence you want and with the delays you want (a sketch follows below).
Once a launch is done, its xterm should kill itself.
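A minimal sketch of such a master script; the child script names and the delay are hypothetical:
#!/bin/bash
# run each child script in its own xterm, in sequence, with a delay
for script in launch_grem.sh launch_cnn.sh
do
    xterm -e "bash $script" &
    sleep 15   # give each launch time to come up before starting the next
done
wait   # keep the master script alive until every xterm has exited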
Edit: you can manually kill one if you know it is going to hang, e.g. below:
#!/bin/bash
source /opt/ros/kinetic/setup.bash
source ~/catkin_ws/devel/setup.bash

# Start roscore via systemd or rc.local, using lxterminal or another
# terminal, to avoid killing it by accident. Then run the part you think
# is going to hang or cause a problem; echo before each action if necessary.
xterm -geometry 80x36+0+0 -e "echo 'uav' | sudo -S dnsmasq -C /dev/null -kd -F 10.5.5.50,10.5.5.100 -i enp59s0 --bind-dynamic" & sleep 15

# The Ouster lidar can't auto-configure like the Velodyne and will hang
# here, blocking everything after it, so just kill the xterm and carry on
# with the remaining launches.
killall xterm & sleep 1
xterm -e "roslaunch '/home/uav/catkin_ws/src/ouster_driver_1.12.0/ouster_ros/os1.launch' os1_hostname:=os1-991907000715.local os1_udp_dest:=10.5.5.1"

How can I gracefully recover from an attached Docker container terminating?

Say I run this Docker command in one Terminal window:
$ docker run --name stackoverflow --rm ubuntu /bin/bash -c "sleep 5"
And before it exits I run this in a second Terminal window:
$ docker run -it --rm --pid=container:stackoverflow terencewestphal/htop
I'll successfully see htop running in the second container, displaying the bash sleep process running. So far so good.
After 5 seconds, the first container will exit with code 0. All good.
At this time, the second container will exit with code 137, i.e. 128 + 9, killed by SIGKILL. This also makes sense to me since the second container is just attached to the first one.
The problem is that this messes up macOS's Terminal.app's state:
The Terminal's cursor disappears.
Clicking the Terminal window causes mouse location characters to be entered as input.
I'm hoping to find a way to avoid messing up Terminal.app state. Any suggestions?
You can't avoid this behaviour: it is htop's job to restore the terminal state when it terminates, and it can't do that when it is killed with SIGKILL. However, you can fix the terminal window yourself with the reset command, which reinitializes the terminal state.
About the "attached" container:
The --pid=container:<name> option means that the new container runs in the PID namespace of the first container, and as the pid_namespaces(7) man page says:
If the "init" process of a PID namespace terminates, the kernel terminates all of the processes in the namespace via a SIGKILL signal.
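You can reproduce the mechanism from the shell; a sketch, reusing the commands from the question:
# first terminal: the "init" of the shared PID namespace exits after 5 seconds
docker run --name stackoverflow --rm ubuntu /bin/bash -c "sleep 5"

# second terminal: this container's process is SIGKILLed when the first exits
docker run --rm --pid=container:stackoverflow ubuntu sleep 60
echo $?   # prints 137, i.e. 128 + 9 (SIGKILL)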

Preferred way to terminate `ssh -N` in background using bash?

I've started ssh -N <somehost> & in a bash script (to create a tunnel), and the connection persists after the script ends; I can see with ps that the ssh process has detached.
I am currently killing the background job with kill $(jobs -p), but is there a better way to do that?
Do you end your script manually?
If so: catch the QUIT signal (or others) inside your script with the trap builtin, then kill ssh from the handler.
If not: kill ssh at the end of your script.
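A minimal sketch of that trap approach; the host and the forwarding spec are hypothetical, and EXIT is trapped so the tunnel dies however the script ends:
#!/bin/bash
# open the tunnel in the background and remember its PID
ssh -N -L 8080:localhost:80 somehost &
tunnel_pid=$!

# tie the tunnel's lifetime to the script: kill it on any exit,
# whether the script finishes normally or is interrupted
trap 'kill "$tunnel_pid" 2>/dev/null' EXIT

# ... work that needs the tunnel ...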
