if an aws spot instance is stopped by AWS and then restarts will it just start where it left off? - amazon-ec2

I am running luigi, a pipeline manager which is processing 1000 tasks. Currently I poll for the AWS termination notice. If it is present then I requeue the job; wait 30 minutes; then launch a new server starting all the tasks from scratch. However sometimes it restarts the same job multiple times which is inefficient.
Instead I am considering using create_fleet with InstanceInterruptionBehaviour=Stop? If I do this then when it restarts will it still be running the luigi daemon and retain the state of all the tasks?

All InstanceInterruptionBehaviour=Stop does is effectively shutdown your EC2 instance rather than terminate it. Since the "persistent" setting is required in addition to EBS storage" you will keep all the data currently on the attached EBS volumes at the time of the instance stop.
It is completely dependent on the application itself (Luigi in this case) to be able to store the state of its execution and pick back up from where it left off. For one, you'll want to ensure you enable the service daemon to automatically start upon system start (example):
sudo systemctl enable yourservice

Related

Does ECS update-service command marks the container instance state to draining when use with the --force-new-deployment option?

The command:
aws ecs update-service --service my-http-service --task-definition amazon-ecs-sample --force-new-deployment
As per AWS docs: You can use this option (--force-new-deployment) to trigger a new deployment with no service definition changes. For example, you can update a service's tasks to use a newer Docker image with the same image/tag combination (my_image:latest ) or to roll Fargate tasks onto a newer platform version.
My question, if I use '--force-new-deployment' (as I will use the exisiting tag or definition), will the underline 'ECS Instance' automatically set to DRAINING state, so that any new task (if any) will not start in the EXISTING ecs-instance that is suppose to go away during rolling-update deployment strategy (or deployment controller) ?
In other words, will there be any chance:
For a new task to be created on the existing/old container instance, that is suppose to go away during rolling update.
Also, what would happen with the ongoing task that is running on this existing/old container instance, that is suppose to go away during rolling update.
Ref: https://docs.aws.amazon.com/cli/latest/reference/ecs/update-service.html
Please note that no Container instance is going anywhere with this 'update-service' command. This command will only create a new Deployment under the ECS service and when the new tasks become healthy, remove the old task(s).
Edit 1:
How about the request that were served by old task?
I am assuming the tasks are behind an Application Load Balancer. In this case, old tasks will be deregistered from the ALB.
Note: In the following discussion, target is the ECS Task.
To give you a brief description on how the Deregistration Delay works with ECS, the following is the sequential order when a task connected to an ALB is stopped. It can be due to a scale in event, deployment of a new task definition, decrease of the number of tasks, force deployment, etc.
ECS sends DeregisterTargets call and the targets change the status to "draining". New connections will not be served to these targets.
If the deregistration delay time elapsed and there are still in-flight requests, the ALB will terminate them and clients will receive 5XX responses originated from the ALB.
The targets are deregistered from the target group.
ECS will send the stop call to the tasks and the ECS-agent will gracefully stop the containers (SIGTERM).
If the containers are not stopped within the stop timeout period (ECS_CONTAINER_STOP_TIMEOUT by default 30s) they will force stopped (SIGKILL).
As per the ELB documentation [1] if a deregistering target has no in-flight requests and no active connections, Elastic Load Balancing immediately completes the deregistration process, without waiting for the deregistration delay to elapse. However, even though target deregistration is complete, the status of the target will be displayed as draining and you can see the registered Target of the old task is still present in the TargetGroup console with status as draining until the deregistration delay elapses.
The design of ECS is to stop the Task after the completion of Draining process as mentioned in the ECS document [2] and hence the ECS Service waits for the TargetGroup to complete the Draining process before issuing the stop call.
Ref:
[1] Target Groups for Your Application Load Balancers - Deregistration Delay - https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-target-groups.html#deregistration-delay
[2] Updating a Service - https://docs.aws.amazon.com/AmazonECS/latest/developerguide/update-service.html

Stopping ec2 to scale-down before it completes the process running on it

We have an application which runs on ec2 instances that we use as docker host in ecs cluster. There are multiple tasks running on each ec2. Each task picks up one message from SQS and process some event(which convert data from one format to other and upload it to a file system), which may take from few seconds to 12-15hours depending on the size of the data it contains. Once an event processing is completed task is stopped and for new message(event) new task is created. Whenever there are huge number of messages in SQS we are scaling-up the instances to process the messages (to avoid wait time). When (number of messages) < (number of running tasks) for certain duration then we need to scale-down i.e. we need to terminate ec2 instances.
For ec2 scale-down we need to make sure there is no task running i.e. container is not processing any event on it. There is no way to found out which ec2 are free(not processing any event) so we marked container to DRAINING state and then TERMINATING ec2. But while we mark any container to draining state, task running on it, are stopped(hence event processing is killed in between and data is lost). Is there any why we can complete the process before tasks are stopped or anyone can suggest better approach.

Run batch file on windows ec2 instance then shutdown instance?

I have a .bat file on a windows ec2 instance I would like to run every day.
Is there any way to schedule the instance to run this file every day and then shut down the ec2 instance without manually going to the ec2 management console and launching the instance?
There are two requirements here:
Start the instance each day at a particular time (This is an assumption I made based on your desire to shutdown the instance each day, so something needs to turn it on)
Run the script and then shutdown
Option 1: Start & Stop
Amazon CloudWatch Events can perform a task on a given schedule, such as once-per-day. While it has many in-built capabilities, it cannot natively start an instance. Therefore, configure it to trigger an AWS Lambda function. The Lambda function can start the instance with a single API call.
When the instance starts up, use the normal Windows OS capabilities to run your desired program, eg: Automatically run program on Windows Server startup
When the program has finished running, it should issue a command to the Windows OS to shutdown Windows. The benefit of doing it this way (instead of trying to schedule a shutdown) is that the program will run to completion before any shutdown is activated. Just be sure to configure the EC2 instance to Stop on Shutdown (which is the default behaviour).
Option 2: Launch & Terminate
Instead of starting and stopping an instance, you could instead launch a new instance using an Amazon CloudWatch Events schedule.
Pass the desired PowerShell script to run in the instance's User Data. This script can install and run software.
When the script has finished, it should call the Windows OS command to shutdown Windows. However, this time configure Terminate on Shutdown so that the instance is terminated (deleted). This is fine because the above schedule will launch a new instance next time.
The benefit of this method is that the software configuration, and what should be run each time, can be fully configured via the User Data script, rather than having to start the instance, login, change the scripts, then shutdown. There is no need to keep an instance around just to be Stopped for most of the day.
Option 3: Rethink your plan and Go Serverless!
Instead of using an Amazon EC2 instance to run a script, investigate the ability to run an AWS Lambda function instead. The Lambda function might be able to do all the processing you desire, without having to launch/start/stop/terminate instances. It is also cheaper!
Some limitations might preclude this option (eg maximum 5 minutes run-time, limit of 500MB disk space) but it should be the first option you explore rather than starting/stopping an Amazon EC2 instance.

Terminate google cloud compute engine instance with shell/bash script

I am using google cloud compute engine for some computational intense tasks (32 parallel processes). My tasks sometimes finished in mid night, and I am wondering is there a way to stop the instance once all my processes stop? I prefer to make a shell script to monitor all my processes and stop the instance when everything is finished.
halt or shutdown or poweroff does not works for me, as my command only submit jobs. The command finished immediately while all processes (computing tasks) kept running on the backend. If I put halt or shutdown at the end of my command line, the instance simply shut down as I entered the command
Take a look at How to automatically exit/stop the running instance.
To summarize, you can simply run halt or shutdown -h now. Once the operating system halts the instance will terminate and you will no longer be charged.
Alternatively if you've started the instance with the appropriate permissions/scope you could issue the gcloud compute instances stop command:
https://cloud.google.com/sdk/gcloud/reference/compute/instances/stop
I typed:
gcloud compute instances stop [virtual-machine-instance]
Therafter, you specify the zone [zone] to confirm terminating instance

If i Stop an Amazon EC2 instance, is this saved?

If I stop an EC2 instance, rather than terminate, will this image be saved in my account and be available to be used at a later stage? ..as I noticed terminated instances eventually dissappear, would a stopped instance be available to boot up and start when ever?
Also regarding charging, I assume the cost for having a stopped instance would be charged per GB similar to a custom AMI?
Also another possibly simple question, if I shutdown a machine over SSH or via a script, does this initiate a termination or just a stop an instance (I assume it terminates the instance).
Thanks
If you terminate the EBS backed instance, it will remove it from the list of running instance, including it's allocated EBS volume. Unless you set the instance attribute not to delete the volume.
If you only stop, it will changed to stopped status and you can start it again later.
If you shutdown a machine, it default's to stop.
A good read to protect your instance see: http://alestic.com/2010/01/ec2-instance-locking
If you stop an instance based on EBS, then the instance will terminate automatically but you'll be charged for EBS storage until you delete the EBS.

Resources