I'm trying to configure logrotate with a Docker container. I'm running httpd as a background process in the container, and after log rotation I need to reload it so it starts using the new log files. I don't want to restart the container because of possible downtime. Sending SIGHUP with docker kill --signal=HUP <container> is not working, as my entrypoint is a bash script which does not handle signals. I tried to do it like this in the logrotate config:
...
sharedscripts
postrotate
service httpd reload > /dev/null 2>/dev/null || true
docker exec some-container kill -HUP $(ps -e | awk '{print $1}')>>/tmp/exec-out.log 2>>/tmp/exec-out.log || true
endscript
but I got
kill: sending signal to 30 failed: No such process
kill: sending signal to 31 failed: No such process
kill: sending signal to 32 failed: No such process
kill: sending signal to 33 failed: No such process
kill: sending signal to 34 failed: No such process
kill: sending signal to 35 failed: No such process
kill: sending signal to 36 failed: No such process
kill: sending signal to 37 failed: No such process
kill: sending signal to 38 failed: No such process
I'm quite new to Docker and Linux, and I don't really understand why docker exec is being given process IDs that do not exist.
EDIT: If possible, I would also prefer not to change the bash script to trap SIGHUP, but instead solve the problem in the logrotate config.
I believe the $(ps -e | awk '{print $1}') command substitution is being evaluated on the host, not inside the container, before docker exec runs, so it collects host PIDs that do not exist in the container's PID namespace, which is why kill reports processes it cannot find.
If you ran the container with --pid=host, those host PIDs would be valid inside it and the kill would work.
Alternatively, you can get the PID of the container's main process (as seen by the host) like so:
docker inspect --format {{.State.Pid}} <container>
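That PID lives in the host's PID namespace, so, as a sketch (assuming the container is named some-container as in the question and you have permission to signal its processes), you could combine it with a plain kill:

kill -HUP "$(docker inspect --format '{{.State.Pid}}' some-container)"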
But you don't actually need the pid, you can send a signal yourself using docker kill like so:
docker kill --signal=HUP some-container
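So, for the postrotate script in the question, a minimal sketch could look like this (assuming the container is still named some-container; note this signals the container's PID 1, so it only helps if that process handles or forwards HUP):

sharedscripts
postrotate
    docker kill --signal=HUP some-container >> /tmp/exec-out.log 2>&1 || true
    # or, if PID 1 ignores HUP as described in the question, signal httpd inside
    # the container directly; assumes pkill (procps) is available in the image:
    # docker exec some-container pkill -HUP httpd >> /tmp/exec-out.log 2>&1 || true
endscript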
Related
I'm building some CI pipelines, and part of it is a bash wrapper script around a docker container running ansible commands. The trouble I'm having is that on job abort the container keeps running, which is potentially dangerous.
What I have currently is:
#!/bin/bash
CONTAINER=ansible

function kill_container() {
    echo "$0 caught $1" >&2
    docker kill ${CONTAINER}
    exit $?
}

trap 'kill_container SIGINT' SIGINT
trap 'kill_container SIGTERM' SIGTERM

function ansible_base() {
    docker run -d --rm --name ${CONTAINER} someorg/ansible:latest "$@"
    docker logs --follow ${CONTAINER}
}

ansible_base "$@"
and my local test is simply ./run.sh sleep 30.
For the purpose of reproducibility, you can substitute alpine:latest as the image and it behaves the same.
Prior to adding -d to the docker run and following up with docker logs it did not respect SIGINT at all, but now that part works as expected. E.g.:
./ci/run.sh sleep 30
5f5d78cfea27cdc15f5fede2003352253ae3254f44489ab4689ebca8d0f91768
^C./ci/run.sh caught SIGINT
ansible
However, if I run pkill run.sh from another terminal, it still waits the full 30 seconds before handling the signal, and then raises an error because the container is already gone. E.g.:
./ci/run.sh sleep 30
a642a1060dc9d340e92dc255d68a9d9cb26d62ec59c5ef8d4e3d4198f1692c3e
./ci/run.sh caught SIGTERM
Error response from daemon: Cannot kill container: ansible: Container a642a1060dc9d340e92dc255d68a9d9cb26d62ec59c5ef8d4e3d4198f1692c3e is not running
Ultimately, the observed behaviour in the CI system is the same. The process is issued a SIGTERM, and then after not responding for 30 seconds a SIGKILL. This terminates the wrapper script, but not the docker command.
As @brunson said, I needed an init process to handle signal propagation.
When I was originally writing this my thought was "it's just a command, it doesn't need an initd" which was somewhat true until the very instant I needed it to respect signals at all. Frankly it was a foolish thought in the first place.
Anyhow, to accomplish the fix I used tini.
Added to Dockerfile:
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
and run.sh is back down to a much more manageable:
#!/bin/bash

function ansible_base() {
    docker run --rm someorg/ansible:latest "$@"
}

ansible_base "$@"
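For what it's worth, the 30-second delay in the pkill case comes from bash itself: a trap handler only runs once the current foreground command returns, and here that command was docker logs --follow, which does not return until the container's sleep 30 finishes. A minimal sketch of a script-side workaround, keeping the original trap but waiting on a background job instead (tini remains the fix used above; this is only to illustrate the mechanism):

#!/bin/bash
CONTAINER=ansible

function kill_container() {
    echo "$0 caught $1" >&2
    docker kill ${CONTAINER}
    exit $?
}

trap 'kill_container SIGINT' SIGINT
trap 'kill_container SIGTERM' SIGTERM

docker run -d --rm --name ${CONTAINER} someorg/ansible:latest "$@"
# wait is interrupted by signals, so the trap fires immediately instead of
# only after the foreground command finishes
docker logs --follow ${CONTAINER} &
wait $!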
I am attempting to use bash to permanently kill the process sharingd.
I have tried using the command sudo kill -9 (with the PID of the port sharingd is using), but sharingd just reopens on another port.
lsof -i
sudo kill -9 PID
I expect this to stop sharingd from running, but it just comes back on a different port each time.
Pardon my inability to display my code as actual code, I am somewhat new to Stack Overflow.
Let's consider the following script:
#!/bin/bash
while true ; do : ; done
After running the script, bash goes into a loop, but it can be interrupted (by pressing Ctrl-C or issuing a kill -2 command) or terminated (by issuing a kill command). All works perfectly well. But now let's consider another script:
#!/bin/bash
sleep 60
After running this script, the bash process no longer reacts to SIGINT or SIGTERM signals. Of course it reacts to pressing Ctrl-C or to killing the sleep process, but what I am interested in is making the bash process itself react to these signals. I need this because I am building a Docker image with a bash script as the entrypoint, and Docker sends signals to PID 1 in the container, which in my case will be the bash process. I am struggling with making the container shut down gracefully: the bash process does not react to signals, so Docker ends up killing it in order to shut down the container.
Any help will be appreciated.
Consider this Dockerfile:
FROM centos:7
COPY entrypoint.sh /usr/bin/entrypoint.sh
RUN chmod 760 /usr/bin/entrypoint.sh
ENTRYPOINT ["/usr/bin/entrypoint.sh"]
with the corresponding entrypoint.sh script
#!/usr/bin/env bash

function finish {
    # stop holding process here
    echo "exiting gracefully . . ."
    kill -TERM "$child" 2>/dev/null
    exit 0
}

trap finish SIGHUP SIGINT SIGQUIT SIGTERM

# your process which holds the container, eg
sleep 60 &
child=$!
wait "$child"
Build the image:
docker build --no-cache -t overflow .
Run the image:
docker run overflow:latest
If you Ctrl-C within 60 seconds you'll see the output:
exiting gracefully . . .
showing that the signal was caught by your script, which stopped its child process and exited, and the container stopped with it.
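The same trap also handles Docker's own shutdown signal. As a quick sketch (the container name here is arbitrary), docker stop sends SIGTERM to PID 1, so the logs should show the same message:

docker run -d --name overflow-test overflow:latest
docker stop overflow-test     # SIGTERM to PID 1, SIGKILL only after the grace period
docker logs overflow-test     # should end with "exiting gracefully . . ."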
A good resource on signals and containers can be found here
If your Docker API is 1.25+ you can run the container with --init:
docker run --init -it
--init - Run an init inside the container that forwards signals and reaps processes
Description from the Docker guide: https://docs.docker.com/engine/reference/commandline/run/
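For example, applied to the image built above (just a usage sketch; --rm is optional):

docker run --init --rm overflow:latest

Here --init puts a minimal init process in front of your entrypoint to forward signals and reap children, in the same spirit as the trap in entrypoint.sh.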
I'm using the simplest Dockerfile ever:
FROM ubuntu
COPY script.sh /script.sh
CMD /script.sh
Where all the script does is:
#!/bin/bash

function sigterm() {
    echo "Got SIGTERM"
    exit
}
trap sigterm SIGTERM

i=1
while true; do
    echo "$(date +%H:%M:%S) | $((i++)) | $HOSTNAME"
    sleep 1
done
I'm running this container in Minikube, but I can't get it to catch any SIGTERM from kubernetes.
I tried deleting the pod/deployment and scaling it up and down. In no case did it get a SIGTERM before being deleted. It respects terminationGracePeriodSeconds, but doesn't seem to run the preStop command or send the SIGTERM before killing the pod.
Is that due to using minikube? or am I doing something else wrong?
(the deployment is not part of a service, it's just a deployment)
(SSHing into the pod and manually killing the process works as expected)
Reading the Dockerfile documentation on CMD
The CMD instruction has three forms:
CMD ["executable","param1","param2"] (exec form, this is the preferred form)
CMD ["param1","param2"] (as default parameters to ENTRYPOINT)
CMD command param1 param2 (shell form)
If you use the shell form of the CMD, then the command will execute in /bin/sh -c
So you are using the shell form, which means your command runs as /bin/sh -c /script.sh. Then, when Kubernetes sends a SIGTERM to the container, it is not the script.sh process that receives the signal, but the /bin/sh process. That's why you don't see the "Got SIGTERM" message.
When creating a Dockerfile, make sure you use the exec form. Otherwise the application will be started as a subcommand of /bin/sh -c, which does not pass signals. The container's PID 1 will be the shell; your application will not receive any signals.
Try changing your Dockerfile to use the exec form
FROM ubuntu
COPY script.sh /script.sh
CMD ["/script.sh"]
I have a Java process daemonized using the daemon command (RHEL 6.2). I'm using the following line to start the process, and the line below it to stop it:
daemon --command "/opt/my-service" --respawn --name=my-service --verbose
daemon --stop --name=my-service --verbose
Things work until I try to restart my process using a stop/start approach:
daemon --stop --name=my-service --verbose
daemon --command "/opt/my-service" --respawn --name=my-service --verbose
If the process is running before the above commands are executed, the existing process will be stopped, but a new one will not be created. Instead, the following line is logged to /var/log/messages:
Oct 27 07:59:46 myhostname my-service: my-service: fatal: failed to become a daemon: Resource temporarily unavailable
which, as far as I understand, means that we tried to acquire a lock on the pid file but another process was still holding it. In other words: the original process was still running.
Interestingly, it cannot be reproduced by, e.g., the following commands:
daemon --command "sleep 30s" --respawn --name=sleeper --verbose
daemon --stop --name=sleeper --verbose
daemon --command "sleep 30s" --respawn --name=sleeper --verbose
so there must be something in my process which causes/exploits the asynchronous nature of --stop.
How can I make --stop blocking?
This might help:
while true; do
    daemon --name=my-service --running --verbose | grep not
    [ $? -eq 0 ] && exit
    sleep 1
done
Description as p-code:
forever {
    check if daemon --name=my-service --running --verbose returns something containing the word "not"
    if yes, exit
    otherwise sleep a second
}
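Putting it together, a sketch of a restart that only starts the new instance once --stop has actually taken effect (reusing the commands from the question; the 60-iteration cap is an arbitrary safeguard):

daemon --stop --name=my-service --verbose

for i in $(seq 1 60); do
    # same check as above: --running prints a message containing "not"
    # once the old instance is gone
    if daemon --name=my-service --running --verbose | grep -q not; then
        break
    fi
    sleep 1
done

daemon --command "/opt/my-service" --respawn --name=my-service --verbose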