Using Terraform to provision ECS, Elastic, SQS, Lambda - elasticsearch

I am fairly new to Terraform and had a few questions -
- I want to provision a ECS cluster with EC2 launch type running containers. I can do this with Terraform.
- I also need a SQS, Lambda talking to the SQS and feeding data to an Elastic cluster who then will interface with another microservice.
My question is, how should I provision this ?
Is it possible to do all the above in a single ECS cluster using Terraform ?
Would it be better to have the ECS cluster separate and have the SQS, Lambda, Elastic cluster be independent components ? And have the final microservice run as a separate EC2 container (non-ECS) ?
What is the better way of approaching this infrastructure ?

It's a pretty open-ended question so I can provide some guidance based on experience, but please use your best judgement in applying my experience to your individual use case.
What I found in general is that trying to put unrelated things in the same Terraform module (I suppose that's what you mean when you say "components") is asking for trouble. There's a bunch of things that belong together in an ECS cluster - IAM roles and policies, autoscaling group, cluster itself... see for example this module. But SQS, Lambda and Elastic don't feel like they're related. I'd put them separately. Same with your non-ECS microservice.
The other thing I'd want to emphasize is that Terraform is not a deployment tool and should not be used to do that. Terraform shines when defining things that change infrequently and independently. The latter is pretty important because Terraform is not transactional and won't roll back anything if things go wrong. If you need more dynamics and a stronger guarantee of integrity in the AWS world, I'd suggest using Terraform to define a CloudFormation stack and running deployments through the latter.

Related

dynamic ec2 resourcing in declarative cloud formation/terraform

We are moving our infrastructure to cloud formation since it's much easier to describe the infrastructure in a nice manner. This works fantastically well for things like security groups, routing, VPCs, transit gateways.
However, we have two issues which we are struggling with and I don't think fit the declarative, infrastructure-as-code paradigm which things like terrafrom and cloud formation are.
(1) We have a business requirement where we run a scheduled batch at specific times in the day. These are very computationally intensive. To save costs, we run these on an EC2 which is brought up at that time, then torn down when the batch is finished. However, this seems to require a temporary change to the terraform/CF files, then a change back. Is there a more native way of doing this?
(2) We dynamically store and allow to be edited by clients their firewalling rules on their load balancer (ALB). This information cannot be stored in the terraform/CF files since it can be changed by clients on demand.
Is there a way of properly doing these things in CF/Terraform?
(1) If you have to use EC2, you could create a Lambda that would start your EC2 instances. Then, create a CloudWatch Event that triggers the Lambda at your specified date / time. For more details you can see https://aws.amazon.com/premiumsupport/knowledge-center/start-stop-lambda-cloudwatch/. Once the job is done, have your EC2 shut itself down using the awssdk or awscli.
Alternatively, you could use AWS Lambda to run your batch job. You only get charged when the Lambda runs. Likewise, create a CloudWatch Event rule that schedules the Lambda.
(2) You could store the firewall rules in your own DB and modify the actual ALB SG rules using the awssdk. I don't think it's a good idea to store these things in Terraform/CF. IMHO Terraform/CF are great for declaring infrastructure but won't be a good solution for resources that are dynamically changing, especially by third parties like your clients.

EC2 Container Service vs Apache Mesos

We are looking to use Docker container to run our batch jobs in a cluster enviroment.
We are evaluating to use AWS ECS Container Service/Chronos/Mesos.
As far as I know, Apache Mesos has some overlapping features/purpose that EC2 has, like cluster management. Chronos is a distributed scheduler.
I am having dificult to correlate all this technologies to create a architecture!
EC2 service replace Mesos? What about the scheduler?
We are a small team will little experience in cluster development. Which stack is better for our batch processing?
EDIT
I make a huge edit, and i think now i understand the architecture:
This is a sample picture with two cluster been managed by Mesos.
Reading the ECS Container Service documentation(http://docs.aws.amazon.com/AmazonECS/latest/developerguide/scheduling_tasks.html), AWS is on the way of integrete ECS with the Mesos Apache Framework. So I imagine that using in the future, we can use the mesos framework to manage the resources in the ECS Cluster. So it is going to be possible to use Chronos (for batch scheduling) and Marathon (for long running app.)
EDIT
At this moment, we dont have distributed jobs running, like hadoop jobs or sparks jobs. Our job are much simpler, running on single instances of EC2. We are planning to use Docker to run our batch running jobs.
I'd argue it depends on the type of batch jobs, but the Apache Mesos ecosystem is certainly more flexible than ECS to accomodate your needs. The flexibility comes from the fact that Mesos uses a so called two-level scheduling model, which is a fancy name for it outsources the scheduling decision into frameworks (rather than trying to implement each and every existing and future workload scheduling strategy in its core, itself).
You mentioned one such a framework already, Chronos, which is a good working horse, just maybe don't use the dependencies for jobs, ok? Then there is another great batch framework called Cook. Depending on your needs (for example SQL-based batch report generation) you could use Apache Spark. And so on and so forth.
BTW, did I mention already that with Mesos you don't risk a vendor lock-in, while being able to deploy it, depending on your needs, fully in one cloud (such as AWS), hybrid cloud (say AWS and GCP/Azure) or on-premises?
UPDATE: to clarify, of course Mesos has first-class Docker support.

Heroku-like deployment and environment configuration via EC2

I really like the approach of a 12factor app, which you are kinda forced into, when you deploy an application to Heroku. For this question I'm particularly interested in setting environment variables for configuration, like one would do on Heroku.
As far as I can tell, there's no way to change the ENV for one or multiple instances within the EC2 console (though it's seems to be possible to set 5 ENV vars when using elastic beanstalk). Therefore my next bet on an Ubuntu based system would be to use /etc/environment, /etc/profile, ~/.profile or just the export command to set ENV variables.
Is this the correct approach or am I missing something?
And if so, is there a best practice on how to do it? I guess I could use something like Capistrano or Fabric, get a list of servers from the AWS api, connect to all of them and change the mentioned files/call export. Though 12factor is pretty well known, I couldn't find any blog post describing how to handle the ENV for a non-trivial amount of instances on EC2. And I don't want to implement such a thing, if somebody already did it very well and I just missed something.
Note: I want a solution without using elastic beanstalk and I don't care about git push deployment or any other Heroku-like feature, this is solely related to app configuration.
Any hints appreciated, thanks!
Good question. There are many ways you can approach your deployment/environment setup.
One thing to keep in mind is that with Heroku (or Elastic Beanstalk for that matter) you only push the code. Their service takes care of the scalability factor and replication of your services across their infrastructure (once you push the code).
If you are using fabric (or capistrano) you are using a push model too, but you have to take care of all the scalability/replication/fault tolerance of your application.
Having said that, if you are using EC2, in my opinion it's better if you leverage AMIs, Autoscale and Cloudformation for your deployments. This is the beauty of elasticity and Virtualization in that you can think of resources as ephemeral. You can still use fabric/capistrano to automate the AMI builds (I use Ansible) and configure environment variables, packages, etc. Then you can define a Cloudformation stack (with a JSON file) and in it you can add an autoscaling group with your prebaked AMI.
Another way of deploying your app is to simply use the AWS Opsworks service. It's pretty comprehensive and it has a lot of options but it may not be for everybody since some people may want a bit more flexibility.
If you want to go 'pull' model you can use Puppet, Chef or CFEngine. In this case you have a master policy server somewhere in the cloud (Puppetmaster, Chef Server or Policy Server). When a server gets spun up, an agent (Puppet agent, Chef Client, Cfengine agent) connects to its master to pick up its policy and then executes it. The policy may contains all the packages and environment variables that you need for your application to function. Again, it's a different model. This model scales pretty well but it depends on how many agents the master can handle and how you stagger the connections from the agents to the master. You can load balance multiple masters too if you want to scale to thousands of servers or you can just simply use multiple masters. From experience, if you want something really "fast" Cfengine works pretty good, there's a good blog comparing the speed of Puppet and CFengine here: http://www.blogcompiler.com/2012/09/30/scalability-of-cfengine-and-puppet-2/
You can also go "push" completely with tools like fabric, Ansible, Capistrano. However, you are constrained by how much a single server (or laptop) can handle multiple connections to thousands of servers that its trying to push to. This is also constrained by network bandwidth, but hey you can get creative and stagger your push updates and perhaps use multiple servers to push. Again it works and it's a different model so it depends which direction you want to go.
Hope this helps.
If you dont need beanstalk, you can look at AWS Opsworks (http://aws.amazon.com/opsworks/). Ideal for Web worker kind of deployment scenerios. You can pass any variable from outside the code here (even Chef recipies)
It's might be late but they what we are doing.
We have python script that take env var in Json and send that to as post data to another python script that convert those vars to ymal file.
After that we use Jenkins pipline groovy using multibranch. Jenkins do all the build and then code deploy copies those env vars to ec2 instanced running in autoscaling.
Off course we are doing some manapulation from yaml to simple text file so code deploy can paste it on /etc/envoirments

Cloudformation extra when compared with Chef

What can I do with Cloudformation that I can not do with chef? It looks like chef support spawning node in the ec2 and I can read back the information of the spawned nodes (ip,..) so why would still need cloudformation for?
Very little, but what it does makes it worth using.
The primary advantage of CloudFormation is that Amazon ties created resources together. In other words if your application comprises a db, four webservers, an autoscaling group, a launch configuration, ingress rules, some VPC subnets, an internet gateway or two, and a VPN connection, you can manage them in a single place, as a CloudFormation stack. Want to shoot them? That's easy. Kill the stack and every resource dies with it.
Sure you could technically do this with Chef and the Amazon API. CloudFormation is little more than a way of generating extremely sophisticated sets of AWS API calls + Magic (tm), so you could always roll your own. Netflix sort of did that with their OS tool, Asgard, and RightScale and other services are more or less that. If those softwares don't meet your needs or if you can't afford them, CloudFormation is a nice supplement to AWS deployments. In fact, you don't even need to rely on it. It's quite simple to launch chef-solo from CloudFormation, which allows you to leverage the advantages of both.

Elasticsearch deployment environment setup

We are working on setting up our elasticsearch backend for a production environment. Up until a few weeks ago, we were using Solr, but we decided to use Elasticsearch for a few reasons, but the biggest reason is for the distributed nature of the backend.
With that said, we've been looking for some documentation and best practices on deploying elasticsearch using amazon's services.
For the moment, we were considering using a extra-large box and then scaling out from there, but we aren't sure that is the best approach. For example, it may be better to have three mediums than one extra-large.
We intend to index around 100K to 150K documents per day up to around ten million docs.
The question is, can anyone provide a general environment / deployment diagram for elasticsearch or best practices in general?
There's some docs for elasticsearch that talk about EC2 deployment. There's an autodiscovery plugin based on EC2 tags or security groups or whatever you like. You can also choose S3 for persistence, although that may not really be necessary.
I'd advise launching it in a VPC so you can have permanent internal IPs, in regular EC2 your internal IPs will change with every reboot even if you're using Elastic IPs.

Resources