Script executed in EC2 User Data with nohup doesn't run until manually SSHing in - bash

I have the following command in my EC2 User Data:
setsid nohup /root/go/src/prometheus-to-cloudwatch-master/dist/bin/prometheus-to-cloudwatch --cloudwatch_namespace TestBox/Prometheus --cloudwatch_region eu-west-1 --cloudwatch_publish_timeout 5 --prometheus_scrape_interval 30 --prometheus_scrape_url MyScrapeUrl &
This script takes metrics published at the prometheus_scrape_url and pushes them to CloudWatch. By default it runs in the foreground and, every 30 seconds, outputs the number of metrics pushed to CloudWatch. I have added setsid nohup to run the script in the background in a new session.
The issue here is that the script doesn't seem to run until I SSH into the box following initialisation and su to the root user (it's like it's queued to be run when I next SSH as the root user).
My expected behaviour is that the script runs as part of the user data and I should never need to SSH into the box.
The script in question is: https://github.com/cloudposse/prometheus-to-cloudwatch
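For reference, a sketch of how that same command might be wrapped in a complete User Data script; the shebang, the log redirection, and the stdin detach are assumptions added for debugging and are not part of the original configuration:
#!/bin/bash
# Assumed layout: redirect output to a log and detach stdin so the process
# cannot block on the cloud-init console (the log path is a placeholder).
setsid nohup /root/go/src/prometheus-to-cloudwatch-master/dist/bin/prometheus-to-cloudwatch --cloudwatch_namespace TestBox/Prometheus --cloudwatch_region eu-west-1 --cloudwatch_publish_timeout 5 --prometheus_scrape_interval 30 --prometheus_scrape_url MyScrapeUrl > /var/log/prometheus-to-cloudwatch.log 2>&1 < /dev/null &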

Related

Initialize and terminate a bot on EC2 remotely

I am trying to run a script to start, stop, or restart a bot from my front end webpage.
I have a bot that runs almost 24/7 on a Linux EC2 instance, and a webpage front end that allows for parameter input and shows the current status of the bot. The front end sends a POST request to a lambda function, which writes the parameters to my S3 bucket. The script to start the bot on the EC2 instance pulls the latest parameters from S3 and initializes the bot. When the bot starts up and shuts down, it writes the status ("running", "stopped") to a file in the S3 bucket, which then shows on the front end.
I have looked into SSM Run Command with Lambda, but given that the bot runs for days at a time, I don't believe that's viable. Additionally, it uses an agent to connect, so trying to use the screen command would terminate when the agent terminates.
I have also tried adding the script to my EC2 instance’s User Data, but that does not seem to work. Similarly a cron job for reboot does not work.
I've considered using a trigger file in S3, i.e. having the EC2 instance check at a given time interval for some trigger file in S3 that would indicate a start or stop, but that seems very resource intensive.
What alternatives do I have?
The solution that worked for me was setting up a crontab job that runs on reboot, then starting, stopping, and restarting the EC2 instance with a lambda function.
Steps to resolve this for anyone in the same boat:
SSH into the EC2 instance
crontab -e
add the following line:
@reboot sleep 60 && cd /home/ec2-user/bot_folder/ && /usr/bin/screen -S bot -dm /usr/bin/python3 run_bot.py
(for vim, press i to enter insert mode, paste the line and make changes, then press esc :wq enter to save)
Ensure that the script has all of its paths specified absolutely. In my case, using Selenium, the chromedriver path needed to be specified.
Finally, set up a Lambda function to start/stop/reboot your instance, as referenced above.
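For illustration, the instance control that Lambda performs boils down to the following, shown here as the equivalent AWS CLI calls (the instance ID is a placeholder, not from the original setup):
# Start, stop, or reboot the bot instance (placeholder instance ID)
aws ec2 start-instances --instance-ids i-0123456789abcdef0
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 reboot-instances --instance-ids i-0123456789abcdef0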

Background process doesn't work after closing the ssh

I'm using an SSH script for remote deployment. After a push to the master branch, the pipeline connects via SSH and calls deploy.sh, which should run in the background so as not to waste the devops node's running time. Inside the script, Docker restarts and pulls from the repo. I tried nohup, &, and disown, but the result remains the same: the script only works correctly if it finishes while the SSH console is still open. If we close the SSH connection, nothing happens, as if we never called the script. Why would this happen?
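A common pattern for this situation (not taken from the post itself, and with an assumed log path) is to detach all three standard streams so that nothing keeps the ssh channel open or ties the script to the dying session:
# Sketch: detach stdin/stdout/stderr so closing the ssh session
# does not take the deploy script down with it
ssh user@host 'nohup ./deploy.sh > deploy.log 2>&1 < /dev/null &'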

Vagrant shell incompatible with foreground "server boot" commands

I'm running a shell script on vagrant up via the inline shell config of a Vagrantfile. One of the commands starts up a tomcat web server which normally runs in the foreground.
My dilemma is that a) the commands in the Vagrant shell script should exit or run in the background so that the prompt returns to the user correctly, and b) if I send the command to the background with &, the output isn't visible and the user has no idea when the web server has finished booting.
I either need a way to send output to the background and tell the user when the server has booted, or a way to send to the background once the server has booted. Without messing with the maven/tomcat side I don't see a way to do it.
$script = <<-SCRIPT
# other commands here
mvn tomcat7:run &
SCRIPT
config.vm.provision "shell", inline: $script, privileged: false, run: "always"
I use the nohup command for this and redirect the output of the command to a specific log file. It does not fully address the "tell the user when the server has booted" part, though.
here's an example of a command I run
nohup java -jar /test/selenium-server-standalone-$1.jar -role hub &> /home/vagrant/nohup.grid.out&
If you want to give the user as much information as possible from the provisioning shell, you could sleep for 5-10 seconds (depending on your deployment) and then run tail -20 <log_file>, which gives users a good view of the task's progress.
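Building on that, a rough sketch of a provisioning snippet that backgrounds Tomcat and then polls the log until it reports startup; the log path and the matched message are assumptions and may need adjusting to your Tomcat/Maven output:
nohup mvn tomcat7:run > /home/vagrant/tomcat.log 2>&1 &
# Poll the log for a startup message (the string below is an assumption),
# giving up after roughly two minutes.
for i in $(seq 1 60); do
  grep -q "Running war on" /home/vagrant/tomcat.log && break
  sleep 2
done
tail -20 /home/vagrant/tomcat.log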

run script at termination of process

I'm computing in an AWS EC2-type environment. I have a script that runs on the headnode, then executes a shell command on a VM, which it detaches from the headnode's parent shell.
headnode.sh
#!/bin/bash
...
nohup ssh -i vm-key ubuntu@vm-ip './vm_script.sh' &
exit 0
The reason I do it like this is that vm_script.sh takes a very long time to finish. So I want to detach it from the headnode shell and have it run in the background on the VM.
vm_script.sh
#!/bin/bash
...
ssh -i head-node-key user@head-node-ip './delete-vm.sh'
exit 0
Once the VM completes the task it sends a command back to the headnode to delete the VM.
Is there a more reliable way to execute the final script on the headnode following completion of the script on the VM?
E.g., After I launch the vm_script.sh, could I launch another script on the headnode that waits for the completion of the PID on the VM?
Assuming you trust the network link, the more reliable way is also the simpler one. I would rewrite headnode.sh as presented below:
#!/bin/bash
...
ssh -i vm-key ubuntu@vm-ip './vm_script.sh' && ./delete-vm.sh
The script above waits for vm_script.sh to terminate and then invokes ./delete-vm.sh if vm_script.sh returns no error code.
If you don't want to be stuck in the shell waiting for the termination of headnode.sh, just call it using nohup ./headnode.sh &
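For completeness, a sketch of that last step with the output captured in a log (the log filename is an assumption):
# Run headnode.sh detached so the local shell is free again;
# its own ssh call still waits for vm_script.sh to finish remotely.
nohup ./headnode.sh > headnode.log 2>&1 &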

Invoking and detaching bash scripts from the current shell

I'm having a bit of difficulty developing bash-based deployment scripts for a pipeline I want to run on an OpenStack VM. There are 4 scripts in total:
head_node.sh - launches the vm and attaches appropriate disk storage to the VM. Once that's completed, it runs the scripts (2 and 3) sequentially by passing a command through ssh to the VM.
install.sh - VM-side, installs all of the appropriate software needed by the pipeline.
run.sh - VM-side, mounts storage on the VM and downloads raw data from object storage. It then runs the final script, but does so by detaching the process from the shell created by ssh using nohup ./pipeline.sh &. The reason I want to detach from the shell is that the next portion is largely just compute and may take days to finish. Therefore, the user shouldn't have to keep the shell open that long and it should just run in the background.
pipeline.sh - VM-side, essentially a for loop that iterates through a list of files and sequentially runs commands on those and on intermediate files. The results are analysed and then staged back to object storage. The VM then essentially tells the head node to kill it.
Now I'm running into a problem with nohup. If I launch the pipeline.sh script normally (i.e. without nohup) and keep it attached to that shell, everything runs smoothly. However, if I detach the script, it errors out after the first command in the first iteration of the for loop. Am I thinking about this the wrong way? What's the correct way to do this?
So this is how it looks:
$./head_node.sh
head_node.sh
#!/bin/bash
... launched VM etc
ssh $vm_ip './install.sh'
ssh $vm_ip './run.sh'
exit 0
install.sh - omitted - not important for the problem
run.sh
#!/bin/bash
... mounts storage downloads appropriate files
nohup ./pipeline.sh > log &
exit 0
pipeline.sh
#!/bin/bash
for f in $(find . -name '*ext')
do
process1 $f
process2 $f
...
done
... stage files to object storage, unmount disks, additional cleanups
ssh $head_node 'nova delete $vm_hash'
exit 0
Since I'm invoking the run.sh script from an ssh session, subprocesses launched from the script (namely pipeline.sh) will not properly detach from the shell and will error out when the ssh session that invoked run.sh terminates. The pipeline.sh script can be properly detached by calling it from the head node instead, e.g., nohup ssh $vm_ip './pipeline.sh' &; this keeps the session alive until the end of the pipeline.
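To make that concrete, here is a sketch of how head_node.sh might change under that suggestion; the assumption that run.sh no longer launches pipeline.sh itself, and the log filename, are mine:
#!/bin/bash
# ... launch VM etc
ssh $vm_ip './install.sh'
ssh $vm_ip './run.sh'
# Launch the long-running stage from the head node so the ssh session,
# not the VM-side shell spawned by run.sh, owns the process.
nohup ssh $vm_ip './pipeline.sh' > pipeline.log 2>&1 &
exit 0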
