Add an EC2 instance as a worker to Amazon MWAA

I am currently using Amazon MWAA as my managed Airflow. I want to have two types of worker nodes, but MWAA currently doesn't support that. I want to have:
High Compute Optimized CPU workers
GPU workers
I want to create a separate queue for each worker type and submit jobs to those worker nodes.
Is it possible to add an existing EC2 instance (say, a GPU instance) to MWAA? I only see Start and Stop EC2 operators available.
Does anyone have any pointers on this?

If you have an EKS cluster, it's possible to define a GPU node pool, and using the KubernetesPodOperator you can run a Docker container on that GPU pool.
Another option is ECS (easier to set up); you can see a good example of running GPU workloads from Airflow in this article.
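Here is a minimal sketch of the EKS approach, assuming an Airflow 2.x environment with the cncf.kubernetes provider installed and a GPU node group labeled accelerator=nvidia-gpu; the namespace, image, and label are illustrative:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)
from kubernetes.client import models as k8s

with DAG(
    dag_id="gpu_on_eks_example",
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    gpu_task = KubernetesPodOperator(
        task_id="train_on_gpu",
        name="train-on-gpu",
        namespace="airflow",                      # hypothetical namespace
        image="my-registry/gpu-training:latest",  # hypothetical image
        cmds=["python", "train.py"],
        # Pin the pod to the GPU node pool (label is illustrative).
        node_selector={"accelerator": "nvidia-gpu"},
        # Tolerate the taint commonly placed on GPU nodes.
        tolerations=[k8s.V1Toleration(
            key="nvidia.com/gpu", operator="Exists", effect="NoSchedule",
        )],
        # Ask the NVIDIA device plugin for one GPU; older provider
        # versions take `resources=` instead of `container_resources=`.
        container_resources=k8s.V1ResourceRequirements(
            limits={"nvidia.com/gpu": "1"},
        ),
        get_logs=True,
    )
```

Because the pod runs on the EKS node pool rather than on an MWAA worker, this sidesteps MWAA's single-worker-type limitation entirely.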

Related

deploy bolts/spout to a specific supervisor

We are running a Storm application using a single type of instance in AWS and a single topology to run our system.
This is causing some resource limitation issues.
The way we want to address this is by splitting our IO-intense bolts onto a cluster of a few dozen t1.small machines (for example) and all our CPU-intense bolts onto two large machines with lots of CPU and memory.
Basically, what I am asking is: is there a way to start all these supervisors and then deploy one topology that places the CPU-intense bolts on the big machines and the IO-intense bolts on the small machines?
You can implement a custom scheduler using the IScheduler interface.
See
http://www.exogeni.net/2015/04/enabling-site-aware-scheduling-for-apache-storm-in-exogeni/
https://dcvan24.wordpress.com/2015/04/07/metadata-aware-custom-scheduler-in-storm/
https://github.com/xumingming/storm-lib/blob/master/src/jvm/storm/DemoScheduler.java

What to do when ECS-agent is disconnected?

I have an issue where, from time to time, one of the EC2 instances within my cluster has its ECS agent disconnected. This silently removes the EC2 instance from the cluster (i.e. it is no longer eligible to run any services) and silently drains my cluster of serving servers. I have my cluster backed by an autoscaling group, spawning servers to keep up the healthy amount. But the servers with disconnected ECS agents are not marked as unhealthy, so the autoscaling group thinks everything is alright.
I have the feeling there must be an (easy) way to mitigate this, or else I have a big problem with choosing ECS and using it in production.
We had this issue for a long time. With each new AWS ECS-optimized AMI it got better, but as of 3 months ago it still happened from time to time. As mcheshier mentioned, make sure to always use the latest AMI, or at least the latest AWS ECS agent.
The only way we were able to resolve it was through:
Timed autoscale rotations
We would try to prevent it by scaling up and down at random times
Good CloudWatch alerts
We happened to have our application set up as a bunch of microservices that were all queue (SQS) based, so we could scale up and down based on the queues. We had decent monitoring set up that let us approximate the rate of each queue across the number of ECS containers. When we detected that the rate was off, we would rotate that whole ECS instance. I.e., say our cluster deployed 4 running containers of worker-1, and we approximate that each worker does 1000 messages per 5 minutes. If our queue rate was 3000 per 5 minutes and we had 4 workers, then 1 was not working as expected. We had some scripts set up in Lambda to find the faulty one and terminate the entire instance that ran that container.
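As a hedged sketch of the rotation idea (not our exact script, and the cluster name is illustrative): a periodic Lambda in Python that finds container instances whose ECS agent is disconnected and marks them unhealthy so the autoscaling group replaces them:

```python
import boto3

ecs = boto3.client("ecs")
autoscaling = boto3.client("autoscaling")

CLUSTER = "my-cluster"  # hypothetical cluster name


def handler(event, context):
    # List every container instance registered with the cluster.
    arns = ecs.list_container_instances(cluster=CLUSTER)["containerInstanceArns"]
    if not arns:
        return
    described = ecs.describe_container_instances(
        cluster=CLUSTER, containerInstances=arns
    )["containerInstances"]
    for ci in described:
        # agentConnected flips to False when the ECS agent drops off.
        if not ci["agentConnected"]:
            # Mark the backing EC2 instance unhealthy; the autoscaling
            # group will terminate and replace it.
            autoscaling.set_instance_health(
                InstanceId=ci["ec2InstanceId"],
                HealthStatus="Unhealthy",
                ShouldRespectGracePeriod=False,
            )
```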
I hope this helps. I realize it's specific to our in-house application, but the advice I can give you and anyone else is to take the initiative and put as many metrics out there as you can. That will let you do some neat analytics and look for kinks in the system, this being one of them.

running multiple instances of a spark app on mesos through marathon

I am trying to run a Spark streaming app through Marathon on Mesos, and this job eventually stores some counts in a Cassandra instance. My question is: should I set the number of instances (on Marathon) for this app to 2 (for HA)? The issue is, wouldn't the 2nd instance just be a replica of the first one, so processing and results would be duplicated?
No, you don't set the number of instances to 2 for HA. Marathon will restart any app that has gone down for whatever reason. It is good practice to implement health checks, though.
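As a hedged sketch of what a health check could look like: registering an HTTP health check on an existing Marathon app through its REST API, assuming the app exposes a /health endpoint (the Marathon URL and app id are illustrative):

```python
import requests

MARATHON = "http://marathon.example.com:8080"  # hypothetical Marathon endpoint
APP_ID = "/spark-streaming-counts"             # hypothetical app id

payload = {
    "healthChecks": [{
        "protocol": "HTTP",
        "path": "/health",            # assumes the app serves a health endpoint
        "portIndex": 0,
        "gracePeriodSeconds": 300,    # give the Spark app time to start up
        "intervalSeconds": 60,
        "maxConsecutiveFailures": 3,  # then Marathon kills and restarts the task
    }],
}

# PUT partially updates the existing app definition.
resp = requests.put(f"{MARATHON}/v2/apps{APP_ID}", json=payload)
resp.raise_for_status()
```

With a single instance plus a health check, Marathon both restarts the app if it dies and replaces it if it stops responding, without duplicating the processing.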

How to configure memory based Auto Scaling on Amazon EC2?

I am deploying a Rails application on EC2 instances, where I want to configure Auto Scaling together with an Elastic Load Balancer. On crossing a particular threshold, I want to spawn a new instance. While configuring the trigger for the auto scaling group, we have options for CPU utilization, network, or disk IO; but being a Rails application, I will face a resource crunch on memory rather than on CPU or IO.
Has anyone configured EC2 Auto Scaling for a rails application successfully? What is the preferred way of using AWS with rails?
FYI: I am using passenger as application server.
Thanks for your time.
I have not done it with Rails, but with Java on Tomcat. We used Tomcat valves/scripts to detect memory usage and post it to Amazon CloudWatch as custom CloudWatch metrics. You can create a scale-out trigger based on a CloudWatch alarm monitoring this metric.
Parts of the above technique can be carried over to Rails.
Actually, I think you should first try to tune your Passenger configuration based on the instance type you are using (here is an article about this: http://blog.scoutapp.com/articles/2009/12/08/production-rails-tuning-with-passenger-passengermaxprocesses). This should ensure that at full load you are using all the available RAM without spawning more Passenger instances than the RAM can hold.
In my experience this leads to saturating all the resources (CPU and RAM) together, so you can trigger an autoscaling policy based on CPU usage. You should also tweak the instance type to achieve the best performance (I've used cc1.xlarge instances with a fair amount of success).
If you're set on autoscaling based on RAM, you should be able to create a CloudWatch metric that monitors RAM usage and autoscale on that metric. Creating a metric is just publishing the metric data at regular intervals using the CloudWatch API (http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/CloudWatch/Metric.html). You could create a Rails background task that runs every minute and publishes the metric data.
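A minimal sketch of that in Python with boto3 (the linked doc is the Ruby SDK equivalent); the namespace and metric name are illustrative, and the script would run every minute from cron or a background task:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")


def used_memory_percent():
    # Parse /proc/meminfo; values are in kB.
    meminfo = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":")
            meminfo[key] = int(value.split()[0])
    available = meminfo.get("MemAvailable", meminfo["MemFree"])
    return 100.0 * (meminfo["MemTotal"] - available) / meminfo["MemTotal"]


# Publish the custom metric; an alarm on it can then drive a scaling policy.
cloudwatch.put_metric_data(
    Namespace="Custom/Rails",  # hypothetical namespace
    MetricData=[{
        "MetricName": "MemoryUtilization",
        "Unit": "Percent",
        "Value": used_memory_percent(),
    }],
)
```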
I think you can use the new alarms feature: monitor the memory metric and start the instance using the API:
http://aws.typepad.com/aws/2010/12/amazon-cloudwatch-alarms.html

MapReduce on AWS

Anybody played around with MapReduce on AWS yet? Any thoughts? How's the implementation?
It's easy to get started.
Here's a FAQ: http://aws.amazon.com/elasticmapreduce/faqs/
And here's the Getting Started Guide: http://docs.amazonwebservices.com/ElasticMapReduce/latest/GettingStartedGuide/
If you have an EC2 account already, you can enable MapReduce and have a sample application up and running in less than 10 minutes using the AWS Management Console.
I did the pre-packaged Word Count sample application, which returns a count of each word contained in about 20 MB of text. You can provision up to 20 instances to run concurrently, though I just used 2 instances and the job completed in about 3 minutes.
The job returns a 300 KB alphabetized list of words and how often each word appears in the sample corpus.
I really like that MapReduce jobs can be written in my choice of Perl, Python, Ruby, PHP, C++, R, or Java. The process was painless and straightforward, and the interface gives good feedback on the status of your instances and the job flow.
Be aware that, since AWS charges for a full hour when an instance is created, and since the MapReduce instances are automatically terminated at the end of the job flow, the cost of multiple fast-running job flows can add up quickly.
For example, if I create a job flow that uses 20 instances and returns results in 15 minutes, and then re-run the job flow 3 more times, I'll be charged for 80 hours of machine time even though I only had 20 instances running for 1 hour.
You also have the possibility to run MapReduce (Hadoop) on AWS with StarCluster. This tool configures the cluster for you and has the advantage that you don't have to pay the extra Amazon Elastic MapReduce price (if you want to reduce your costs), and you can create your own image (AMI) with your tools (this can be good if your tools can't be installed by a bootstrap script).
It is very convenient because you don't have to administer your own cluster. You just pay per use so I think it is a good idea if you have a job that needs to run once in a while. We are running Amazon MapReduce just once a month so, for our usage, it is worth it.
However, as far as I can tell, a drawback of Amazon Elastic MapReduce is that you can't tell which operating system is running, or even its version. This caused me problems running C++ code compiled with g++ 4.44; some of the OS images do not support the cURL library, etc.
If you don't need any special libraries for your use case, I would say go for it.
Good answer by MB.
To be clear: you can run Hadoop clusters in two ways:
1) Run it on Amazon EC2 instances. This means that you have to install it, configure it, terminate it, etc.
2) Run it using Elastic MapReduce, or EMR: this is an automated way to run a Hadoop cluster on Amazon Web Services. You pay a little extra on top of the basic EC2 cost, but you don't need to manage anything: just upload your data, then your algorithm, then crunch. EMR will shut down the instances automatically once your jobs are finished.
Best,
Simone
EMR is the best way to use the available resources, with very little added cost over EC2, and you will see how time-saving and easy it is. Most of the MR implementations in the cloud use this model, e.g. Apache Hadoop on Windows Azure, Mortar Data, etc. I have worked on both Amazon EMR and Apache Hadoop on Windows Azure and found both incredible to use.
Also, depending on the type / duration of jobs you plan to run, you can use AWS spot instances with EMR to get better pricing.
I am working with AWS EMR. It is pretty neat: once you start up a cluster and log into the master node, you can play around with the Hadoop directory structure and do pretty cool things. If you have an edu account, don't forget to apply for a research grant; they give up to $100 in free credits to use on AWS.
AWS EMR is a good choice when you use S3 storage for your data.
It provides out of the box integration with S3 for loading files and posting processed files.
In use cases where you need to run the job on demand, you are saved from the cost of running the whole cluster all the time, which really helps you save on instance hours.
Leveraging that advantage, one can use AWS Lambda to spawn event-driven clusters.
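As a hedged sketch of that event-driven idea, here is a Lambda handler in Python that spins up a transient EMR cluster when a new object lands in S3; the release label, instance types, roles, and streaming step are all illustrative:

```python
import boto3

emr = boto3.client("emr")


def handler(event, context):
    # Triggered by an S3 PutObject event notification.
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    emr.run_job_flow(
        Name="on-demand-wordcount",  # hypothetical job name
        ReleaseLabel="emr-6.10.0",   # illustrative release
        Applications=[{"Name": "Hadoop"}],
        Instances={
            "InstanceGroups": [
                {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge",
                 "InstanceCount": 1},
                {"InstanceRole": "CORE", "InstanceType": "m5.xlarge",
                 "InstanceCount": 2},
            ],
            # Terminate the cluster as soon as the steps finish,
            # so you only pay for the job itself.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        Steps=[{
            "Name": "process-new-object",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "hadoop-streaming",
                    "-input", f"s3://{bucket}/{key}",
                    "-output", f"s3://{bucket}/output/{key}",
                    "-mapper", "mapper.py",   # hypothetical streaming scripts
                    "-reducer", "reducer.py",
                ],
            },
        }],
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
```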
