I'd like to stop an instance automatically after a time limit, say 1 hour.
This is for the case where the network connection gets lost and I can't send a stop request.
I start instances programmatically via the AWS SDK for PHP, so maybe there is a way to pass an option for a time limit.
Right-clicking an instance offers a "Create Alarm" option with a "Stop Instance" action, but there is no "running time" condition available.
The really easy way would be to add a shutdown -h +60 to your startup script in /etc/rc.local if you're using Linux.
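As a sketch, assuming an EBS-backed Linux instance where /etc/rc.local runs at the end of boot (the wall message is just an example):

    #!/bin/sh
    # /etc/rc.local -- runs once at the end of boot on many distributions.
    # Schedule a halt 60 minutes from now; if you do reconnect in time,
    # cancel the pending shutdown with: shutdown -c
    shutdown -h +60 "auto-stop: 60 minute limit reached" &
    exit 0

On an EBS-backed instance the default instance-initiated shutdown behavior is to stop rather than terminate, and you can set it explicitly with the InstanceInitiatedShutdownBehavior parameter when you launch the instance through the SDK.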
For my Jelastic servers, since I don't use them much, I would like to put them into something similar to sleep, so that they are only activated on an HTTP request.
I saw that trial accounts have something like sleep, but I would like to know if there is a way to do it with a normal account.
For instance, I had the idea of making a script to turn them off at night, but I don't know how to wake them up.
Any ideas are welcome.
https://ops-docs.jelastic.com/jca-sleep-results
There is a start/stop scheduler in the Marketplace within the Jelastic dashboard. Check "Env Start/Stop Scheduler" in Marketplace > Add-Ons.
If you're interested in the code, or can't find that add-on at your Jelastic provider, you can find it at https://github.com/jelastic-jps/start-stop-scheduler
Note that this is not quite the same as sleep (it will not wake up automatically when there's an HTTP request): the environment will be completely offline during the hours that you specify.
Is it possible to create a script that is always running on my VPS server? And what do I need to do to run it the whole time? (I don't have a VPS server yet, but if this is possible I want to buy one!)
Yes you can; there are several ways to get the result you expect.
Supervisord
Supervisord is a process control system that keeps a process running: it automatically starts or restarts your process whenever necessary.
When to use it: use it when you need a process that runs continuously (see the sketch after these examples), e.g.:
A queue worker that continuously reads a database, waiting for jobs to run.
A Node application that acts as a daemon.
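A minimal sketch, assuming a Debian/Ubuntu-style Supervisor install; the program name, command, and paths are placeholders:

    # Example only: register a long-running worker with Supervisor.
    sudo tee /etc/supervisor/conf.d/queue-worker.conf >/dev/null <<'EOF'
    [program:queue-worker]
    command=/usr/local/bin/queue-worker
    autostart=true
    autorestart=true
    redirect_stderr=true
    stdout_logfile=/var/log/queue-worker.log
    EOF

    # Tell Supervisor to pick up the new program definition:
    sudo supervisorctl reread
    sudo supervisorctl update

With autorestart=true, Supervisor restarts the worker whenever it exits, which is exactly the behavior you want for a daemon-like process.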
Cron
Cron allows you to run processes regularly, at fixed time intervals. You can, for example, run a process every minute, every 30 minutes, or at whatever interval you need.
When to use it: use it when your process is not long-running (it does a task and exits) and you do not need it restarted automatically as with Supervisord, e.g. (see the crontab sketch after these examples):
A task that collects logs every day and sends them as a gzip by email.
A backup routine.
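As a sketch, entries like these go into the crontab you edit with crontab -e (the script paths are placeholders):

    # m h dom mon dow  command
    # Collect yesterday's logs and email them gzipped, every day at 07:00:
    0 7 * * * /usr/local/bin/mail-logs.sh
    # Run a backup routine every night at 02:30:
    30 2 * * * /usr/local/bin/backup.sh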
Whichever you choose, there are many tutorials on the internet on how to configure both, so I won't go into the details here.
We're looking at using EC2 Auto Scaling to deal with spikes in load. In our case we want to scale up instances based on SQS queue size and then scale down when the queue size gets back under control. Each SQS message defines a potentially long-running job (sometimes up to 20 minutes per message) that must complete before the instance can be terminated.
Our software handles the shutdown process gracefully, so issuing sudo service ourapp stop will wait for the app to complete before returning.
My question: when Auto Scaling starts scaling down, it issues a terminate (which apparently is like hitting the power button); will it wait for our app to completely exit before the instance is 'powered off'?
https://forums.aws.amazon.com/message.jspa?messageID=180674 <- that and other things I've found seem to suggest that it doesn't
On most newer AMIs, the machines are given the equivalent of a 'halt' (or 'shutdown -h now') command so that the services are gracefully shut down. As long as your program plays nicely with the startup/shutdown scripts, you should be fine; but if your program takes more than 20 seconds to terminate, you may find that Amazon kills the instance completely.
Amazon's Auto Scaling documentation doesn't specify the termination process, but AWS's general EC2 documentation does describe what happens during termination: the machine is given a 'shutdown' command, and the default shutdown timeout on most systems is 30 seconds.
In mid-2014, AWS introduced 'lifecycle hooks', which allow full control of the termination process.
Our high-level scale-down process is (a CLI sketch follows the steps):
Auto Scaling sends a message to an SQS queue with an instance ID
Controller app picks up the message
Controller app issues a 'stop instance' request
Controller app re-queues the SQS message while the instance is stopping
Controller app picks up the message again, checks if the instance has stopped (or re-queues the message to try again later)
Controller app notifies Auto Scaling to proceed with the termination (completing the lifecycle action with result 'CONTINUE')
Controller app deletes the message from the SQS queue
More details: http://docs.aws.amazon.com/autoscaling/latest/userguide/lifecycle-hooks.html
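For reference, the same flow can be driven from the AWS CLI; the hook name, group name, ARNs, and instance ID below are placeholders:

    # 1) Register a hook that pauses termination and notifies an SQS queue:
    aws autoscaling put-lifecycle-hook \
        --lifecycle-hook-name graceful-stop \
        --auto-scaling-group-name my-asg \
        --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
        --notification-target-arn arn:aws:sqs:us-east-1:123456789012:scale-down \
        --role-arn arn:aws:iam::123456789012:role/asg-sqs-publish \
        --heartbeat-timeout 1800

    # 2) While the app drains, the controller can buy more time:
    aws autoscaling record-lifecycle-action-heartbeat \
        --lifecycle-hook-name graceful-stop \
        --auto-scaling-group-name my-asg \
        --instance-id i-0123456789abcdef0

    # 3) Once the instance has stopped cleanly, let termination proceed:
    aws autoscaling complete-lifecycle-action \
        --lifecycle-hook-name graceful-stop \
        --auto-scaling-group-name my-asg \
        --instance-id i-0123456789abcdef0 \
        --lifecycle-action-result CONTINUE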
Use the ReplaceUnhealthy process option in Auto Scaling.
Refer to:
http://alestic.com/2011/11/ec2-schedule-instance
In particular, see this comment.
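If the idea is to stop instances on a schedule without Auto Scaling treating them as unhealthy and replacing them, you can suspend just that process; a sketch with a made-up group name:

    # Keep Auto Scaling from replacing deliberately stopped instances:
    aws autoscaling suspend-processes \
        --auto-scaling-group-name my-asg \
        --scaling-processes ReplaceUnhealthy

    # Re-enable the process when normal operation resumes:
    aws autoscaling resume-processes \
        --auto-scaling-group-name my-asg \
        --scaling-processes ReplaceUnhealthy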
I am wondering if there is a way to monitor these automatically. Right now, in our production/QA/dev environments, we have a bunch of services running that are critical to the application, and we also have automated ETLs running on Windows Task Scheduler at a set time of day. Currently I have to log into each server and check that all the services are running fine, check the event logs for errors, check Task Scheduler to see whether the ETLs ran well, and so on, all manually. I am wondering if there is a tool out there that will do the monitoring for me and send emails only when something needs attention (like ETLs failing to run, a service stopping for whatever reason, or errors in the event log). Thanks for the help.
Paessler PRTG Network Monitor can do all that. We have had very good experience with it.
http://www.paessler.com/prtg/features
Nagios is the best tool for monitoring. It checks the server status as well as the services defined on it, and if any service or the system itself goes down, it sends mail to the specified address.
Refer to: http://nagios.org/
Thanks for the above information. I looked at those options, but they have a price. What I did instead is an inexpensive way to address my concerns.
For my Windows Task Scheduler jobs that run every night, I installed this tool/service from CodePlex, which is working great:
http://motash.codeplex.com/documentation#CommentsAnchor
For Windows services, I just set the "Recovery" tab in each service's properties with the actions to take when it fails (like restart, reboot, or run a program, which could send an email notification).
I built a simple tool (https://cronitor.io) for monitoring periodic/scheduled tasks. The name is a play on "cron" from the Unix world, but it is system/task agnostic. All you have to do is make an HTTP request to a unique tracking URL whenever your job runs. If your job doesn't check in according to the rules you define, it will send you an email/SMS message.
It also allows you to track the duration of your jobs by making calls at the beginning and end of your task. This can be really useful for long-running jobs, since you can be alerted if they start taking too long to run. For example, I once had a backup task that was scheduled every hour. About six months after I set it up, it started taking longer than an hour to run!
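For a concrete picture, a wrapper script might look like this; the tracking URL, and the /run and /complete suffixes, are placeholders, since the exact ping URLs come from your monitor's dashboard:

    #!/bin/sh
    # Hypothetical cron wrapper around a backup job.
    PING_URL="https://cronitor.example/unique-code"   # placeholder URL

    curl -fsS "$PING_URL/run" >/dev/null       # check in at the start
    /usr/local/bin/backup.sh                   # the actual job (made-up path)
    curl -fsS "$PING_URL/complete" >/dev/null  # check in at the end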
There is https://eyewitness.io, which is for monitoring server cron tasks, queues, and websites. It makes sure each of your cron jobs runs when it is supposed to, and alerts you if it fails to run.
Suppose I include a rather long-running startup task in my Azure role, something that runs for up to several minutes. What happens if the startup task runs "too long"?
I'm currently testing on the Compute Emulator and observe the following.
I have a 450-megabyte .zip file together with Info-ZIP's unzip. The startup task unzips the archive. Deployment starts and I look into Task Manager: numerous service processes start, then unzip.exe runs. After about two minutes all those processes stop, then start anew, and unzip.exe starts again.
So it looks like a deployment is allowed to run for about two minutes, then is forcefully reset and started again.
Is this the expected behavior? Does the same happen in the real cloud? Are there any hard limits on how long a role startup can take? How do I address this situation, other than moving the unpacking into RoleEntryPoint.OnStart()?
I had the same question, so I tried an experiment. I ran a Startup Task with taskType="simple" (so that it would block the role from beginning to execute) and let it run for 50 hours. The Fabric Controller did not complain and the portal did not show any error. After the 50 hours were up, its long "do nothing" loop finished, the Startup Task exited, and my Web Role started up fine.
So my empirical test says Startup Tasks can take a long time! At least 50 hours.
This should inform the load balancer that your process is still busy:
http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.serviceruntime.roleinstancestatuscheckeventargs.setbusy.aspx
I have run startup tasks that run for a pretty long time (think 20-30 mins) and the role simply stays in a 'Busy' state. I don't think there is a hard limit on how long the role will stay in that state as long as the Startup task is still executing and has not exited with a non-zero return code (in fact, this is a gotcha for most first-time startup-task creators when their task pops a prompt). The FC is technically still running just fine, so there would be no reason to 'recover' the role (i.e. heartbeats are still going).
The dev emulator just notices when the role hasn't started and warns you. If you click the 'keep waiting' option, it will continue to run the Startup task to completion. The cloud, of course, does not warn you.
I've never tried a task that ran super long, so there might be a very high limit. I seem to recall 3 hours being a magic number in some timeout cases like role recycles, but I have never tried it...
There are heartbeats that the Azure Fabric Agent performs against the role. If these are not acknowledged (say, because of a long-running blocking process), the role could be flagged as unavailable.
You might try putting your startup process into a background thread that runs independently. This should help keep the role from being recycled while the process is starting up. Just keep in mind you may need to make some adjustments if you get requests before the role fully starts up. There's also a way (that I can't seem to recall ATM) to flag the role and take it out of the load balancer temporarily while your process completes.