Script invoked from remote server unable to run service correctly - bash

I have a unix script that invokes another script on a remote unix server.
amongst other commands i am stopping a service. The stop command essentially translates to
ssh -t -t -q ${AEM_USER}#${SERVERIP} 'bash -l -c "service aem stop"'
The service is getting stopped but when i start back the service it just creates the .pid file and does not perform the start up. When i run the command for start i.e.
ssh -t -t -q ${AEM_USER}#${SERVERIP} 'bash -l -c "service aem start"'
it does not show any error. On going to the server and checking the status
service aemauthor status
Below message is displayed
aem dead but pid file exists
Also when starting the service by logging in to the server, it works as expected along with the message
Removing stale pidfile (pid: 8701)
Starting aem

We don't know the details of the service script of aem.
I guess the problem is related to the SIGHUP signal. When we log off from a shell or disconnect from ssh, the OS will send HUP signal to all processes that started in this terminated shell. If the process didn't handle the HUP signal, it would exit by default.
When we run a command via ssh remotely, the process started by this command will receive HUP signal after ssh session is terminated.
We can use the nohup command to ignore the HUP signal.
You can try
ssh -t -t -q ${AEM_USER}#${SERVERIP} 'bash -l -c "nohup service aem start"'
If it works, you can use nohup command to start aem in the service script.

As mentioned at the stale pidfile syndrome, there are different reasons for pidfiles getting stalled, like for instance some issues with the way your handles its removal when the process exits... but considering your only experiencing when running remotely, I would guess it might be related to what is being loaded or not by your profile... check the most voted solid answer at the post below for some insights:
Why Does SSH Remote Command Get Fewer Environment Variables
As described in the comments of the mentioned post, you can try sourcing /etc/profile or ~/.bash_profile before executing your script to test it, or even trying to execute env locally and remotelly to compare variables that are being sourced or not.

Related

call a script automatically in container before docker stops the container

I want a custom bash script in the container that is called automatically before the container stops (docker stop or ctrl + c).
According to this docker doc and multiple StackOverflow threads, I need to catch the SIGTERM signal in the container and then run my custom script when the event appears. As I know SIGTERM can be only used from a root process with PID 1.
Relevand part of my Dockerfile:
...
COPY container-scripts/entrypoint.sh /
ENTRYPOINT ["/entrypoint.sh"]
I use [] to define the entrypoint and as I know this will run my script directly, without having a /bin/sh -c wrapper (PID 1), and when the script eventually exec another process, that process becomes the main process and will receive the docker stop signal.
entrypoint.sh:
...
# run the external bash script if it exists
BOOT_SCRIPT="/boot.sh"
if [ -f "$BOOT_SCRIPT" ]; then
printf ">> executing the '%s' script\n" "$BOOT_SCRIPT"
source "$BOOT_SCRIPT"
fi
# start something here
...
The boot.sh is used by child containers to execute something else that the child container wants. Everything is fine, my containers work like a charm.
ps axu in a child container:
PID USER TIME COMMAND
1 root 0:00 {entrypoint.sh} /bin/bash /entrypoint.sh
134 root 0:25 /usr/lib/jvm/java-17-openjdk/bin/java -server -D...
...
421 root 0:00 ps axu
Before stopping the container I need to run some commands automatically so I created a shutdown.sh bash script. This script works fine and does what I need. But I execute the shutdown script manually this way:
$ docker exec -it my-container /bin/bash
# /shutdown.sh
# exit
$ docker container stop my-container
I would like to automate the execution of the shutdown.sh script.
I tried to add the following to the entrypoint.sh but it does not work:
trap "echo 'hello SIGTERM'; source /shutdown.sh; exit" SIGTERM
What is wrong with my code?
Your help and comments guided me in the right direction.
I went through again the official documentations here, here, and here and finally I found what was the problem.
The issue was the following:
My entrypoint.sh script, which kept alive the container executed the following command at the end:
# start the ssh server
ssh-keygen -A
/usr/sbin/sshd -D -e "$#"
The -D option runs the ssh daemon in a NOT detach mode and sshd does not become a daemon. Actually, that was my intention, this is the way how I kept alive the container.
But this foreground process prevented to be executed properly the trap command. I changed the way how I started the sshd app and now it runs as a normal background process.
Then, I added the following command to keep alive my docker container (this is a recommended best practice):
tail -f /dev/null
But of course, the same issue appeared. Tail runs as a foreground process and the trap command does not do its job.
The only way how I can keep alive the container and let the entrypoint.sh runs as a foreign process in docker is the following:
while true; do
sleep 1
done
This way the trap command works fine and my bash function that handles the SIGINT, etc. signals runs properly when the time comes.
But honestly, I do not like this solution. This endless loop with a sleep looks ugly, but I have no idea at the moment how to manage it in a nice way :(
But this is another question that not belongs to this thread (but could be great if you can suggest my a better solution).

Can't terminate node(js) process without terminating ssh server in docker container

I'm using a Dockerfile that ends with a CMD ["/start.sh"]:
#!/bin/bash
service ssh start
/usr/bin/node /myApp/app.js
if for some reason i need to kill the node process, the ssh server is being closed as well (forces me to reboot the container to reconnect).
Any simple way to avoid this behavior?
Thank You.
The container exits as soon as main process of the container exits. In your case, the main process inside the container is start.sh shell script. The start.sh shell script is starting the ssh service and then running the nodejs process as child process. Once the nodejs process dies, the shell script exits as well and so the container exits. So what you can do is to put the nodejs process in background.
#!/bin/bash
service ssh start
/usr/bin/node /myApp/app.js &
# Need the following infinite loop as the shell script should not exit
while do:
sleep 2
done
I DO NOT recommend this approach though. You should have only a single process per container. Read the following answers to understand why -
Running multiple applications in one docker container
If you still want to run multiple processes inside container, there are better ways to do it like using supervisord - https://docs.docker.com/config/containers/multi-service_container/

Run SSH command nohup then exit from server via Jenkins

So I've tried googling and reading a few questions on here as well as elsewhere and I can't seem to find an answer.
I'm using Jenkins and executing a shell script to scp a .jar file to a server and then sshing in, running the build, and then exiting out of the server. However, I cannot get out of that server for the life of me. This is what I'm running, minus the sensitive information:
ssh root#x.x.x.x 'killall -9 java; nohup java -jar /root/project.jar -prod &; exit'
I've tried doing && exit, exit;, but none of it will get me out of the server and jenkins just spins for ever. So the Jenkins build never actually finishes.
Any help would be sweet! I appreciate it.
So I just took off the exit and ran a ssh -f root#x.x.x.x before the command and it worked. The -f just runs the ssh command in the background so Jenkins isn't sitting around waiting.
Usual way of starting a command and sending it to background is nohup command &
Try this. This is working for me. Read the source for more information.
nohup some-background-task &> /dev/null # No space between & and > !
Example :
ssh root#x.x.x.x 'killall -9 java; nohup java -jar /root/project.jar -prod &> /dev/null'
no need exit keyword
source : https://blog.jakubholy.net/2015/02/17/fix-shell-script-run-via-ssh-hanging-jenkins/

Setup and use SSH ControlMaster Session in a Shell Script

I'm writing a script which has several sets of commands that it needs to run on a remote server, with processing of results in between. Currently this is achieved by running ssh for each set of commands, however this requires a new connection to be made and authenticated each time, which is slow.
I recently read about the ControlMaster option in SSH, which seems like exactly what I need, namely the ability to run separate SSH sessions through a single SSH connection.
However, what I'm extremely unclear on is how exactly I would achieve this in my shell script. For example, I was thinking of constructing it like so:
#!/bin/sh
HOST="$1"
# Make sure we clean up after ourselves
on_complete() {
kill $ssh_control_master_id
rm -r "$tmp_dir"
}
trap 'on_complete 2> /dev/null' SIGINT SIGHUP SIGTERM EXIT
tmp_dir=$(mktemp -d "/tmp/$(basename "$0").XXXXXX")
ssh_control_socket="$tmp_dir/ssh_control_socket"
# Setup control master
ssh -o 'ControlMaster=yes' -S "$ssh_control_socket" "$HOST" &
ssh_control_master_id=$!
# Run initial commands
data=$(ssh -S "$ssh_control_socket" "$HOST" 'echo "Foo"')
# Process the data
echo "$data"
# Run some more commands
data=$(ssh -S "$ssh_control_socket" "$HOST" 'echo "Bar"')
# Process the second batch of data
echo "$data"
Just a simple example to give you an idea, but this doesn't seem to be the correct way to do this, as running it will either cause the second ssh command to hang, or each command will just run normally (create their own connection). I'm also not sure how to go about waiting for the master connection to be established, i.e - I'm probably running my actual commands while the remote connection is still being established.
Also on a related note, what is the correct way to close the control master once it's running, is killing it and/or deleting its socket fine?
Your code looks fine. I haven't tested it, but the first process that tries to use the master connection should probably block until the master connection has actually successfully been established. You can use the -N option to avoid running a spurious shell on the master connection:
ssh -N -o 'ControlMaster=yes' -S "$ssh_control_socket" "$HOST" &
It's perfectly fine to simply kill the ssh process once all the subordinate sessions have completed.

How to make ssh to kill remote process when I interrupt ssh itself?

In a bash script I execute a command on a remote machine through ssh. If user breaks the script by pressing Ctrl+C it only stops the script - not even ssh client. Moreover even if I kill ssh client the remote command is still running...
How can make bash to kill local ssh client and remote command invocation on Crtl+c?
A simple script:
#/bin/bash
ssh -n -x root#db-host 'mysqldump db' -r file.sql
Eventual I found a solution like that:
#/bin/bash
ssh -t -x root#db-host 'mysqldump db' -r file.sql
So - I use '-t' instead of '-n'.
Removing '-n', or using different user than root does not help.
When your ssh session ends, your shell will get a SIGHUP. (hang-up signal). You need to make sure it sends that on to all processes started from it. For bash, try shopt -s huponexit; your_command. That may not work, because the man page says huponexit only works for interactive shells.
I remember running into this with users running jobs on my cluster, and whether they had to use nohup or not (to get the opposite behaviour of what you want) but I can't find anything in the bash man page about whether child processes ignore SIGHUP by default. Hopefully huponexit will do the trick. (You could put that shopt in your .bashrc, instead of on the command line, I think.)
Your ssh -t should work, though, since when the connection closes, reads from the terminal will get EOF or an error, and that makes most programs exit.
Do you know what the options you're passing to ssh do? I'm guessing not. The -n option redirects input from /dev/null, so the process you're running on the remote host probably isn't seeing SIGINT from Ctrl-C.
Now, let's talk about how bad an idea it is to allow remote root logins:
It's a really, really bad idea. Have a look at HOWTO: set up ssh keys for some suggestions how to securely manage remote process execution over ssh. If you need to run something with privileges remotely you'll probably want a solution that involves a ssh public key with embedded command and a script that runs as root courtesy of sudo.
trap "some_command" SIGINT
will execute some_command locally when you press Ctrl+C . help trap will tell you about its other options.
Regarding the ssh issue, i don't know much about ssh. Maybe you can make it call ssh -n -x root#db-host 'killall mysqldump' instead of some_command to kill the remote command?
What if you don't want to require using "ssh -t" (for those as forgetful as I am)?
I stumbled upon looking at the parent PID, because CTRL/C from the initiating session results in the ssh-launched process on the remote process exiting, although its child process continues. By way of example, here's my script that is on the remote server.
#!/bin/bash
Answer=(Alive Dead)
Index=0
while [ ${Index} -eq 0 ]; do
if ! kill -0 ${PPID} 2> /dev/null ; then Index=1; fi
echo "Parent PID ${PPID} is ${Answer[$Index]} at $(date +%Y%m%d%H%M%S%Z)" > ~/NowTime.txt
sleep 1
done
I then invoke it with "ssh remote_server ./test_script.sh"
"watch cat ~/NowTime.txt" on the remote server shows the timestamp in the file increasing and declaring that the parent process is alive; once I hit CTRL/C in the launching process, the script on the remote server notes that its parent process has died, and the script exits.

Resources