I am trying to run a script to start, stop, or restart a bot from my front end webpage.
I have a bot that runs almost 24/7 on a Linux EC2 instance, and a webpage front end that allows for parameter input and shows the current status of the bot. The front end sends a POST request to a lambda function, which writes the parameters to my S3 bucket. The script to start the bot on the EC2 instance pulls the latest parameters from S3 and initializes the bot. When the bot starts up and shuts down, it writes the status ("running", "stopped") to a file in the S3 bucket, which then shows on the front end.
I have looked into SSM Run Command with Lambda, but given that the bot runs for days at a time, I don't believe that's viable. Additionally, it uses an agent to connect, so trying to use the screen command would terminate when the agent terminates.
I have also tried adding the script to my EC2 instance’s User Data, but that does not seem to work. Similarly a cron job for reboot does not work.
I've considered using a trigger file in S3, i.e. having the EC2 instance check at a given time interval for some trigger file in S3 that would indicate a start or stop, but that seems very resource intensive.
What alternatives do I have?
The solution that worked for me was setting up a crontab job that runs on reboot, then starting, stopping, and restarting the EC2 instance with a lambda function.
Steps to resolve this for anyone in the same boat:
SSH into the EC2 instance
crontab -e
add the following line:
@reboot sleep 60 && cd /home/ec2-user/bot_folder/ && /usr/bin/screen -S bot -dm /usr/bin/python3 run_bot.py
(for vim, press i to enter insert mode, paste the line and make changes, then press esc :wq enter to save)
Ensure that the script has all of its paths specified absolutely. In my case, using Selenium, the chromedriver path needed to be specified.
Finally, set up a Lambda function to start/stop/reboot your instance, as the comment above suggested.
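The same start/stop/reboot calls can be issued from the AWS CLI, which is handy for checking that the cron job fires before wiring up the Lambda. This is a minimal sketch, assuming the AWS CLI is installed and has credentials; the function name bot_instance and the instance ID are placeholders, not part of the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: control the bot's EC2 instance from the command line.
# Usage: bot_instance start|stop|restart <instance-id>
bot_instance() {
    local action="$1" instance_id="$2"
    case "$action" in
        start)   aws ec2 start-instances --instance-ids "$instance_id" ;;
        stop)    aws ec2 stop-instances --instance-ids "$instance_id" ;;
        restart) aws ec2 reboot-instances --instance-ids "$instance_id" ;;
        *)       echo "usage: bot_instance start|stop|restart <instance-id>" >&2
                 return 2 ;;
    esac
}
```

For example, bot_instance restart i-0123456789abcdef0 reboots the instance, which should in turn trigger the reboot cron entry.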
I want to terminate my EC2 instance at a random point between 1 hour and 1 hour 30 minutes after it starts.
How can I achieve this?
Using a cron job or the at command
Below is the working code, but I would prefer to do this with crontab rather than the sleep command.
sleep $(shuf -i 3600-5400 -n 1) && aws ec2 terminate-instances --instance-ids $AWS_INSTANCE_ID --region ${region}
Thanks
When the instance starts, you could run a script (eg via User Data) that:
Sleeps for the desired time period (eg sleep 3600)
Does a shutdown (eg sudo shutdown -h now)
Be sure to set your termination behaviour to Terminate otherwise the instance will simply stop.
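One way to avoid a long-running sleep is to let shutdown itself do the waiting. The sketch below picks a random delay in whole minutes and passes it to shutdown as +N; the function names are made up for illustration, and it assumes the script runs as root (as User Data does) and that minute granularity is acceptable.

```shell
#!/usr/bin/env bash
# Pick a random delay between 60 and 90 minutes (inclusive).
random_delay_minutes() {
    shuf -i 60-90 -n 1
}

# Schedule a halt that many minutes from now.
# "shutdown -h +N" halts the machine N minutes after the call.
schedule_random_shutdown() {
    local minutes
    minutes="$(random_delay_minutes)"
    shutdown -h "+${minutes}"
}
```

Calling schedule_random_shutdown at the end of the User Data script is enough; with the shutdown behaviour set to Terminate, the halt ends the instance.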
I needed to add shutdown hooks to my EC2 instances to do some resource cleanup.
I also wanted to be able to start and stop my instance manually for testing purposes, with the startup and shutdown hooks triggered the same way as on the initial bootstrap.
I therefore decided to install a script as a service on an AWS EC2 Ubuntu 16.04 LTS instance via a CloudFormation bash script.
Here is the first naive version of the script:
UserData:
"Fn::Base64":
!Sub
- |
#!/usr/bin/env bash
BOOTSTRAP_SCRIPT_NAME=bootstrap
BOOTSTRAP_SCRIPT_PATH=/etc/init.d/${BOOTSTRAP_SCRIPT_NAME}
cat > ${BOOTSTRAP_SCRIPT_PATH} <<EOF
### BEGIN INIT INFO
# Provides: ${BOOTSTRAP_SCRIPT_NAME}
# Required-Start: \\\$local_fs \\\$remote_fs \\\$network \\\$syslog \\\$named
# Required-Stop: \\\$local_fs \\\$remote_fs \\\$network \\\$syslog \\\$named
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Bootstrap an instance
# Description: Bootstrap an instance
### END INIT INFO
function start() {
echo "STARTUP on $(date)"
}
function stop() {
echo "SHUTDOWN on $(date)"
}
case "\$1" in
start)
start | tee -a /var/log/${BOOTSTRAP_SCRIPT_NAME}.log
;;
stop)
stop | tee -a /var/log/${BOOTSTRAP_SCRIPT_NAME}.log
;;
esac
EOF
chmod +x ${BOOTSTRAP_SCRIPT_PATH}
update-rc.d -f ${BOOTSTRAP_SCRIPT_NAME} remove
update-rc.d ${BOOTSTRAP_SCRIPT_NAME} defaults
With this version, the bootstrap script is never started.
I quickly understood that the bootstrap script was installed during the cloud-init phase, which itself runs within the Linux SysV init phase, and so would not take part in the current init phase ... (If this is wrong tell me ;-))
I then decided to start it manually such as apache2 in cloudformation bash examples. I added the following line at the end of the script.
${BOOTSTRAP_SCRIPT_PATH} start
I tested it again, and saw the "STARTUP on XXX" log in the bootstrap.log file after this fix.
But when I tried to stop the instance in the console, no "SHUTDOWN on XXX" logs appeared in the bootstrap.log file ...
I logged into the instance and tried to start/stop the script manually ... all the startup and shutdown logs appeared 8-O. I then supposed that, as the bootstrap script was not identified as an init script, the stop callback would not be called on instance stop or terminate ... (If this is wrong tell me ;-))
I then started and stopped the instance several times from the AWS console, and both STARTUP and SHUTDOWN messages appeared in the logs.
This confirmed my hypothesis. The logs are only missing during the first init and shutdown cycle.
So I did something weird and ugly ... I replaced the start command on the last line with this one:
reboot -n
The script now works as I need, but I think there should be a cleaner way to enable my script for init, or at least for the shutdown phase, during cloud-init without rebooting ...
Does anyone have a better solution or more details on the issue?
PS : I tried init u and telinit u instead of reboot with no success
The reason for this seems to be that the bootstrap is not started as a service the first time. It is run as a normal script. Instead of ${BOOTSTRAP_SCRIPT_PATH} start, try adding the following line to your user-data:
sudo service ${BOOTSTRAP_SCRIPT_NAME} start
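Put together, the end of the user-data script might look like the sketch below. It assumes an Ubuntu 16.04 host where systemd wraps SysV init scripts; the helper name register_and_start is hypothetical, and the explicit daemon-reload is a precaution in case the init system has not yet noticed the freshly written script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: register an init script and start it as a service,
# so the init system records it as running and calls "stop" at shutdown.
register_and_start() {
    local name="$1"
    update-rc.d "$name" defaults   # create the rc?.d start/stop links
    systemctl daemon-reload        # let systemd pick up the new script
    service "$name" start          # start through the service manager
}
```

In the script from the question this would be called as register_and_start "${BOOTSTRAP_SCRIPT_NAME}" in place of running the script path directly.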
I am designing an Auto Scaling system for my application which runs on Amazon EC2 instances. The application reads messages from SQS and processes them.
The Auto Scaling system will monitor two things:
The number of messages in the SQS,
The total number of processes running on all EC2 machines.
For example, if the number of messages in SQS exceeds 3000, I want the system to autoscale: create a new EC2 instance and deploy code on it. Whenever the number of messages goes below 2000, I want the system to terminate an EC2 instance.
I am doing this with Ruby and capistrano.
My question is:
I am unable to find a method to determine number of processes running on all EC2 machines and save the number inside a variable. Could you help me?
You might want to use cron and the CloudWatch API to push the numbers manually to CloudWatch as part of the auto-scaling-group policy. By numbers I mean the number of processes on each instance: ps aux | grep your_process | grep -v grep | wc -l
CloudWatch will let you set an alarm on that custom metric, aggregated by SUM of the number of processes across either all running instances or the auto-scaling group.
Something to let you get started:
Pushing RAM Memory metrics manually:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/mon-scripts-perl.html
One more:
http://aws.typepad.com/aws/2011/05/amazon-cloudwatch-user-defined-metrics.html
For memory it looks simple, as amazon already provides scripts for this. For processes you might need to dig in these scripts or read the official API docs
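For the process count specifically, a cron job on each instance could run something like the sketch below. It assumes the AWS CLI is installed with permission to call put-metric-data; the namespace, metric name and the choice of the Auto Scaling group name as the dimension are illustrative assumptions, picked so that the SUM statistic adds up the values from all instances in the group.

```shell
#!/usr/bin/env bash
# Count processes whose command line matches a pattern.
# pgrep -c prints the count; it exits non-zero when the count is 0,
# so the exit status is deliberately ignored.
count_processes() {
    pgrep -f -c "$1" || true
}

# Publish the count as a custom CloudWatch metric.
# Namespace, metric name and dimension are hypothetical.
push_process_count() {
    local count="$1" group="$2"
    aws cloudwatch put-metric-data \
        --namespace "Custom/Workers" \
        --metric-name "ProcessCount" \
        --dimensions "AutoScalingGroupName=${group}" \
        --unit Count \
        --value "$count"
}
```

A crontab entry would then call push_process_count "$(count_processes your_process)" my-asg once a minute, where my-asg stands in for the real group name.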
EDIT:
If you are now worried about single-point-of-failure in the watching system and you have a list of servers it might be preferred to examine them in parallel from a remote server:
rm -f ~/count.log
# SSH in parallel (ListofIP.txt holds one address per line)
for IP in $(cat ~/ListofIP.txt)
do
    # One background ssh per host; each appends its process count to the log
    ssh -i /path/to/keyfile root@${IP} "ps -ef | grep process_name.rb | grep -v grep | wc -l" >> ~/count.log &
done
# Wait for all background ssh jobs to finish
wait
# Sum up numbers from ~/count.log
# Push TO CloudWatch
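The summing step left as a comment above can be sketched with awk. The function name sum_counts is made up for illustration; it expects one integer per line, as written by the loop.

```shell
#!/usr/bin/env bash
# Sum the per-host counts, one integer per line; prints 0 for an empty file.
sum_counts() {
    awk '{ total += $1 } END { print total + 0 }' "$1"
}
```

The result of sum_counts ~/count.log is the single number to store in a variable or push to CloudWatch.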
I'm trying to write a shell script that automates certain startup tasks based on my location (home/campusA/campusB). I go to University and take classes in two different campuses (hence campusA/campusB). My location is determined by which wireless network I'm connected to. For the purposes of this script, we can assume that I will be connected to one of these networks when the script is called and my script knows which one I'm connected to based on a call to iwconfig.
This is what I want it to do:
cat file1 > file2 # always do this, regardless of where I am
if Im at home:
start tweetdeck, thunderbird, skype
else if Im at campusA:
activate the login script # I need to login on a webform before I get internet access.
# I have written a script to automate this.
# Wait for this script to finish before doing anything else
myProg2 & # I want myProg2 running in the background until I shutdown my computer.
else if Im at campusB:
ssh username@domain # this is the problematic line
myProg2 & # I want myProg2 running in the background until I shutdown my computer.
start tweetdeck, thunderbird
close the terminal with the "exit" command
The problem is that campusB's wireless network is behind a firewall, which grants me internet access ONLY after I successfully ssh to username@domain. After a successful ssh, I need to keep the terminal window active in order to keep the internet access. If I close the terminal window, I lose internet access (this is bad).
When I try doing just ssh username@domain, the script stops because I don't exit the ssh command. I can't ^C out of it, which means that the rest of the script is never executed. I also have the same problem if I just close the terminal window in an attempt to kill the ssh session.
Some googling brought me to subshells, which I'm either using wrong or can't use to solve my problem. So how should I go about solving this problem? I'd appreciate any help - I've been at this for a while now and am unable to find anything helpful. If it makes a difference, I'd rather not store my ssh password in the script.
Further, ampersanding the ssh call (ssh username@domain &) doesn't seem to do any good (can anyone explain why?)
Thank you in advance
EDIT
I must clarify, that the ssh connection has to be active in order for me to have internet access. Thus, when I close the terminal window, I need the ssh connection to still be active.
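The location test in the outline above maps naturally onto a case statement. A minimal sketch, with placeholder network names (HomeNet, CampusA-WiFi, CampusB-WiFi) standing in for the real SSIDs:

```shell
#!/usr/bin/env bash
# Map a wireless network name to a location label.
# The SSIDs below are placeholders; substitute the real network names.
location_for_ssid() {
    case "$1" in
        HomeNet)      echo home ;;
        CampusA-WiFi) echo campusA ;;
        CampusB-WiFi) echo campusB ;;
        *)            echo unknown ;;
    esac
}
```

The script can then branch on location_for_ssid "$(iwgetid -r)", where iwgetid -r (from the same wireless-tools package as iwconfig) prints just the current network name.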
I had a script that looped on 6 servers, calling via ssh in the background. In 1 part of the script, there was a mis-behaving vendor application; the application didn't 'let go' of the connection properly. (other parts of the script using ssh in background worked fine).
I found that using ssh -t -t cured the problem. Maybe this can help you too.
(a teammate found this on the web, and we had spent so much time, I never went back to read the article that suggested this. The man page on our system gave no hint that such a thing was possible)
Hope this helps.
You may want to try to double background myProg2 to detach it from the tty:
# cf. "Wizard Boot Camp, Part Six: Daemons & Subshells",
# http://www.linux-mag.com/id/5981
(myProg2 &) &
Another option may be to use the daemon tool from the libslack package:
http://ingvar.blog.linpro.no/2009/05/18/todays-sysadmin-tip-using-libslack-daemon-to-daemonize-a-script/
Running ssh with a pseudo-tty in a background shell
In addition to @shellter's answer, I would like to add some detail:
where @shellter said:
The man page on our system gave no hint that such a thing was possible
On my system (Debian 7 GNU/Linux), if I hit:
man -Pcol\ -b ssh| grep -A3 '^ *-t '
I could read:
-t Force pseudo-tty allocation. This can be used to execute arbi‐
trary screen-based programs on a remote machine, which can be
very useful, e.g. when implementing menu services. Multiple -t
options force tty allocation, even if ssh has no local tty.
Yes: Multiple -t options force tty allocation, even if ssh has no local tty.
This means: if you remotely run a tool that requires access to a pseudo-terminal (a pty like /dev/pts/0), you can run it by using the -t switch.
But this works only if ssh is run from a shell console (i.e. one that has its own pty). If you plan to run it in a shell session without a console, such as a background script, you can use multiple -t options to force pseudo-tty allocation by ssh.
Multiple ssh shell on one ssh connection
In addition to the answers from @tommy and @geekosaur, I would like to add some detail:
@tommy points to a very interesting feature of ssh. I'm not sure it has much to do with the answer, but when talking about long-lived connections, this feature has to be clearly understood.
Once a connection is established, ssh can use it to carry many things over that one connection:
-L forwards a local port to a destination reachable from the remote side (full syntax: -L localip:localport:distip:distport). localip can be specified to let other hosts on the same local network use the forwarded port, and distip can be any host on the distant network (not only localhost). Example: -L192.168.1.31:8443:google.com:443 lets any host on the local network reach google through your host: http://192.168.1.31:8443
-R The same remarks apply, in the reverse direction!
-M Tells ssh to open a local unix socket that subsequent ssh sessions can attach to (this relies on ControlPath being set in your ssh configuration). Simply open two terminal windows. First, in both windows, run ssh somewhere, then run netstat -tan | grep :22 or netstat -tan | grep 192.168.1.31:22 (assuming 192.168.1.31 is your own host's IP)
Then, to compare, close all your ssh sessions and in the first terminal run ssh -M somewhere, and in the second simply ssh somewhere. You may see in the second terminal:
$ ssh somewhere
+ ssh somewhere
Last login: Mon Feb 3 08:58:01 2014 from elsewhere
If you now run netstat -tan | grep 192.168.1.31:22 (in either of the two open ssh sessions), you should see that there is only one TCP connection.
This kind of feature can be used in combination with -L and maybe some sleep 86399...
To work around a TCP-killer router that closes every TCP connection inactive for more than 120 seconds, I run:
ssh -M somewhere 'while :;do uptime;sleep 60;done'
This ensures the connection stays up even if I don't hit a key for more than two minutes.
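For a script, the master connection can also be opened in the background and controlled through an explicit socket. The sketch below is an assumption-laden variant, not the setup used above: the socket path and function names are made up, and ServerAliveInterval is swapped in for the uptime loop as the keep-alive mechanism.

```shell
#!/usr/bin/env bash
# Hypothetical control-socket location; any path private to the user works.
CTL_SOCKET="${HOME}/.ssh/ctl-campus"

# Open a background master connection that runs no remote command.
# ServerAliveInterval=60 sends a keep-alive probe every 60 seconds.
open_master() {
    ssh -M -S "$CTL_SOCKET" -f -N -o ServerAliveInterval=60 "$1"
}

# Ask the master whether it is still running.
check_master() {
    ssh -S "$CTL_SOCKET" -O check "$1"
}

# Tell the master to exit, closing the shared connection.
close_master() {
    ssh -S "$CTL_SOCKET" -O exit "$1"
}
```

After open_master somewhere, later sessions started with ssh -S "$CTL_SOCKET" somewhere reuse the same TCP connection, and close_master somewhere tears it down at logout.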
Here's a few thoughts that might help.
Sub-shells
Sub-shells fork new processes, but don't return control to the calling shell. If you want to fork a sub-shell to do the work for you, then you'll need to append a & to the line.
(ssh username@domain) &
But this doesn't look like a compelling reason to use a sub-shell. If you had a number of commands you wanted to execute in sequence, yet in parallel with the calling shell, then maybe it would be worth it. For example...
(dothis.sh; thenthis.sh; andthislastthingtoo.sh) &
Forking
I'm not sure why & isn't working for you, but it may be worth looking into nohup as well. This makes the command "immune" to hang up signals.
nohup ssh username@domain (try with and without the & at the end)
Passwords
Not storing passwords in the script is essential for any ssh automation. You can accomplish that using public key cryptography, which is an inherent feature of ssh. I won't go into the details here because there are a number of great resources all across the interwebs on setting this up. I strongly suggest investigating this further.
HOWTO: set up ssh keys - Paul Keck, 2001
SSH Keys - archlinux.org
SSH with authentication key instead of password - Debian Administration
Secure Shell - Wikipedia, the free encyclopedia
If you do go this route, I also suggest running ssh in "batch mode", which disables password querying. On Debian-based systems it also turns on server-alive checks (every 300 seconds by default), so ssh will disconnect automatically if the server becomes unresponsive.
ssh -o 'BatchMode=yes' username@domain
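Batch mode also gives a script a cheap way to check that key-based login is working before it relies on it. A minimal sketch; the function name is hypothetical and the 5-second timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Return 0 if key-based login works without any prompt, non-zero otherwise.
# BatchMode=yes makes ssh fail instead of asking for a password;
# ConnectTimeout keeps an unreachable host from hanging the script.
can_login_with_key() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```

For example: can_login_with_key username@domain || echo "set up ssh keys first" >&2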
Persistence
Then if you want to persist the connection, run some silly loop in bash! :)
ssh -o 'BatchMode=yes' username@domain "while (( 1 == 1 )); do sleep 60; done"
The problem with & is that ssh loses access to its standard input (the terminal), so when it goes to read something to send to the other side it either gets an error and exits, or is stopped by the system with SIGTTIN, which suspends it. The -n and -f options are used to deal with this: -n tells it not to use standard input, -f tells it to set up any necessary tunnels etc., then go to the background and release the terminal.
So the best way to do this is probably to do
ssh -f -N -L 9999:localhost:9999 host # for some random unused port; -N because -f needs either a command or -N
and then manually kill the ssh before logout. Alternately,
ssh -L 9999:localhost:9999 -n host 'while :; do sleep 86400; done' </dev/null &
(The redirection is to make sure the SIGTTIN doesn't happen anyway.)
While you're at it, you may want to save the process ID and shut it down from your .logout/.bash_logout:
ssh -L 9999:localhost:9999 -n host 'while :; do sleep 86400; done' < /dev/null & echo $! > ~/.ssh_pid; chmod 0600 ~/.ssh_pid
and in .bash_logout:
if test -f ~/.ssh_pid; then
set -- $(sed -n 's/^\([0-9][0-9]*\)$/\1/p' ~/.ssh_pid)
if [ $# = 1 ]; then
kill $1 >/dev/null 2>&1
fi
rm ~/.ssh_pid
fi
The extra code there attempts to avoid someone sabotaging your ~/.ssh_pid, because I'm a professional paranoid.
(Code untested and may have typos)
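The same check can be packaged as a function that is easier to exercise on its own. This is a variant sketch, not the code above: the function name is made up, and it refuses to kill anything unless the file holds nothing but a single run of digits.

```shell
#!/usr/bin/env bash
# Kill the process recorded in a PID file, but only if the file holds
# exactly one line consisting solely of digits. Always removes the file.
kill_from_pidfile() {
    local pidfile="$1" pid
    [ -f "$pidfile" ] || return 0
    pid="$(cat "$pidfile")"
    rm -f "$pidfile"
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: refuse to kill
    esac
    kill "$pid" 2>/dev/null
}
```

In .bash_logout this would be called as kill_from_pidfile ~/.ssh_pid.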
It's been a while since I've used ssh, and I can't test it right now, but have you tried the -f switch?
ssh -f username@domain
The man page says it backgrounds ssh. Not sure why & wouldn't work, but I guess it's interpreting it as a command to be run on the remote machine.
Maybe screen + ssh would fit the bill as well?
Something like:
screen -d -m -S sessionName cmd
screen -d -m -S sessionName cmd &
# reconnect with
screen -r sessionName