I can't seem to find any documentation anywhere on how this is possible.
If I knife bootstrap a new AWS instance, and then a few weeks later that machine goes offline, I would like my Chef server to detect this, and bootstrap a new machine.
I understand how to make chef create a new AWS instance, and bootstrap the instance.
I do not know how to make chef detect that a previously deployed box is no longer available.
I can use the chef API to search for existing nodes. But I do not know how to check that those nodes are still accessible over network, or how to run this check regularly.
I believe I am missing something simple? Most resources I have found on this issue assume that this doesn't need to be discussed, as it is self-evident?
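For the reachability part, here is a minimal sketch in Python. It assumes the node addresses have already been pulled from the Chef search API, and that an open SSH port (22) is an acceptable proxy for "still alive". The helper names (`is_reachable`, `unreachable_hosts`) are made up for illustration and are not part of Chef or knife.

```python
import socket


def is_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


def unreachable_hosts(hosts, probe=is_reachable):
    """Return the hosts for which the probe fails, preserving input order."""
    return [host for host in hosts if not probe(host)]
```

Something like this could be run on a schedule (cron, for instance), with the resulting list handed to whatever re-bootstraps machines. A single failed probe may just be a network blip, so it is probably worth requiring a few consecutive failures before acting.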
As per the HashiCorp documentation on Nomad+Consul, Consul service mesh cannot be run on macOS/Windows, since those platforms do not support bridge networking.
https://www.nomadproject.io/docs/integrations/consul-connect
What is the recommended way to set up a local development environment for Nomad+Consul?
I'd suggest looking at setting up your local environment using Vagrant (also a HashiCorp product) and VirtualBox. There are plenty of examples online, for example:
Here is one of the most recent setups with Nomad and Consul, although it is not parametrised much.
Here is one with the core HashiCorp stack, i.e. Nomad, Vault and Consul. This repo is quite old, but that merely means it uses old versions of the binaries, which should be easy to update.
Here is one with only Vault and Consul, but you can add Nomad in a similar way. In fact, this Vagrant setup and the way its files are structured seem pretty close to the one above.
I ran the first two last week with a simple
vagrant up
and it worked almost like a charm. (I think I needed to upgrade VirtualBox, and maybe run vagrant up multiple times because of some odd runtime errors that I didn't want to debug.)
Once Vagrant finishes the build, you can run
vagrant ssh
to get inside the created VM, although the configs are set up to mount volumes/sync files, and all UI components are also exposed on their default ports.
I have an 8-CPU server and I installed CentOS 7 on it. I would like to dynamically and programmatically spin up and down VM nodes to do work, e.g. Hadoop nodes.
Is the right technology for this Vagrant, Puppet, or something else? I have played around with Vagrant, but it appears that every new node requires a new directory in the file system; as far as I can tell, I can't just spin up a new VM with an API call. It also doesn't look like Vagrant has a real API, just machine-readable output. And if I understand it properly, Puppet deals with configuration management for pre-existing nodes.
Is either of these the correct technology to use or is there something else that is more fitting to what I want to do?
Yes, you can use Vagrant to spin up a new VM. Configuration of that particular VM can be done using Puppet. Take a look at: https://www.vagrantup.com/docs/provisioning/puppet_apply.html
And if your problem is having separate directories for each VM, you're looking for a multi-machine setup: https://www.vagrantup.com/docs/multi-machine/
For an example using the multi-machine setup, take a look at https://github.com/mlambrichs/graphite-vagrant/blob/master/Vagrantfile
In the config directory you'll find a YAML file which defines an array that you can use to loop over the different VMs.
So, using knife we can create an EC2 instance and get its corresponding Chef node to show up on the Chef server, all with a single command. So far so good!
But do you have a tool or workflow for validating the link between instance and node? I manually deleted an EC2 instance and ended up with an orphaned Chef node. It seems to me that with a complicated network of instances I could easily have missed that. Or do you bypass this entirely by having a hard rule that no one ever messes with EC2 instances directly, or something similar?
I'm new to Chef, if it's not obvious, and curious to understand how using Chef scales.
Chef records when a node last checked in under node['ohai_time'] so you can use that to filter down results when using Chef for service discovery. A better option is to not use Chef for service discovery in favor of a tool built for it like ZooKeeper or Consul. Other than that, having orphaned data isn't really a huge deal so I generally ignore it. In the past I've also hooked up ASG scaling events to remove the associated node and client. I've also seen people put a script on the machine to be run on shutdown that removes its own node and client, though this can still leave orphans every now and then.
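As a rough illustration of the ohai_time filtering, here is a sketch in Python. It assumes the nodes have already been fetched from the Chef server and reduced to plain dicts with a `name` and an `ohai_time` (a Unix timestamp in seconds). The dict shape and the `stale_nodes` helper are assumptions for this example, not a Chef API.

```python
import time


def stale_nodes(nodes, max_age_seconds, now=None):
    """Return the names of nodes whose last check-in is older than max_age_seconds.

    Nodes with no ohai_time at all are treated as stale.
    """
    now = time.time() if now is None else now
    stale = []
    for node in nodes:
        last_seen = node.get("ohai_time")
        if last_seen is None or now - last_seen > max_age_seconds:
            stale.append(node["name"])
    return stale
```

The threshold should be comfortably longer than the chef-client run interval, otherwise healthy nodes that are merely between runs will show up as stale.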
My organization's website is a Django app running on front end webservers + a few background processing servers in AWS.
We're currently using Ansible for both:
system configuration (from a bare OS image)
frequent manually-triggered code deployments.
The same Ansible playbook is able to provision either a local Vagrant dev VM, or a production EC2 instance from scratch.
We now want to implement autoscaling in EC2, and that requires some changes towards a "treat servers as cattle, not pets" philosophy.
The first prerequisite was to move from a statically managed Ansible inventory to a dynamic, EC2 API-based one, done.
The next big question is how to deploy in this new world where throwaway instances come up & down in the middle of the night. The options I can think of are:
Bake a new fully-deployed AMI for each deploy, create a new AS launch config and update the AS group with that. Sounds very, very cumbersome, but also very reliable because of the clean-slate approach, and it ensures that any system changes the code requires will be there. Also, no additional steps are needed on instance bootup, so instances are up & running more quickly.
Use a base AMI that doesn't change very often, automatically get the latest app code from git upon bootup, start the webserver. Once it's up, just do manual deploys as needed, like before. But what if the new code depends on a change in the system config (new package, permissions, etc.)? It looks like you have to start taking care of dependencies between code versions and system/AMI versions, whereas the "just do a full Ansible run" approach was more integrated and more reliable. Is it more than just a potential headache in practice?
Use Docker? I have a strong hunch it can be useful, but I'm not sure yet how it would fit our picture. We're a relatively self-contained Django front-end app with just RabbitMQ + memcache as services, which we're never going to run on the same host anyway. So what benefits are there in building a Docker image using Ansible that contains system packages + latest code, rather than having Ansible just do it directly on an EC2 instance?
How do you do it? Any insights / best practices?
Thanks!
This question is very opinion-based. But just to give you my take, I would go with pre-baking the AMIs with Ansible and then use CloudFormation to deploy your stacks with autoscaling, monitoring and your pre-baked AMIs. The advantage of this is that if most of the application stack is pre-baked into the AMI, scaling up will happen faster.
Docker is another approach, but in my opinion it adds an extra layer to your application that you may not need if you are already using EC2. Docker can be really useful if you want to, say, run several containerized applications on a single server. Maybe you have some extra capacity on a server, and Docker will allow you to run that extra application on the same server without interfering with existing ones.
Having said that, some people find Docker useful not as a way to optimize the resources of a single server, but rather as a way to pre-bake applications into containers. So when you deploy a new version or new code, all you have to do is copy/replicate these Docker images across your servers, then stop the old container versions and start the new ones.
My two cents.
A hybrid solution may give you the desired result. Store the head Docker image in S3, and pre-bake the AMI with a simple fetch-and-run script that executes on start (or pass the script into a stock AMI with user-data). Handle version control by moving the head image to your latest stable version. You could probably also implement test stacks for new versions by making the fetch script smart enough to identify which Docker version to fetch based on instance tags, which are configurable at instance launch.
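The tag-based selection could look roughly like the Python sketch below. Everything named here is hypothetical: the `docker_version` tag, the `images/app` key prefix and the `.tar` naming are placeholders, and the instance tags are assumed to have been read already into a plain dict.

```python
def image_key_for(tags, default_version="stable", prefix="images/app"):
    """Pick the S3 key of the Docker image an instance should fetch.

    tags: dict of instance tags (assumed to be already retrieved).
    Falls back to default_version when the docker_version tag is
    missing or empty, so untagged instances get the head image.
    """
    version = tags.get("docker_version") or default_version
    return f"{prefix}/{version}.tar"
```

With this shape, production instances launched without the tag pick up the stable image, while a test stack can opt into a candidate build just by setting the tag at launch.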
You can also use AWS CodeDeploy with Auto Scaling and your build server. We use the CodeDeploy plugin for Jenkins.
This setup allows you to:
perform your build in Jenkins
upload to S3 bucket
deploy, one by one, to all the EC2 instances that are part of the assigned AWS Auto Scaling group.
All that with a push of a button!
Here is the AWS tutorial: Deploy an Application to an Auto Scaling Group Using AWS CodeDeploy
When autoscaling the EC2 instances for my application, what is the best way to keep every instance in sync?
For example, there are custom settings and application files like the ones below:
Apache httpd.conf
php.ini
PHP source for my application
To get my autoscaling working, all of these must be configured identically on each EC2 instance, and I want to know the best practice for syncing these elements.
You could use a private AMI which contains scripts that install software or check out the code from SVN, etc. The second possibility is to use a deployment framework like Chef or Puppet.
The way this works with Amazon EC2 is that you can pass user-data to each instance -- generally a script of some sort to run commands, e.g. for bootstrapping. As far as I can see CreateLaunchConfiguration allows you to define that as well.
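To make the user-data idea concrete, here is a small Python sketch that renders such a bootstrap script. It assumes the image runs user-data beginning with `#!` as a script on first boot, and that `svn` and Apache are already installed on the AMI. The `render_user_data` helper, the repository URL and the checkout path are all illustrative.

```python
import shlex


def render_user_data(repo_url, checkout_dir):
    """Build a bash user-data script that checks out code and restarts Apache."""
    lines = [
        "#!/bin/bash",
        "set -e  # stop at the first failing command",
        # shlex.quote guards against spaces or shell metacharacters in the inputs
        f"svn checkout {shlex.quote(repo_url)} {shlex.quote(checkout_dir)}",
        "service httpd restart",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_user_data("https://svn.example.com/app/trunk", "/var/www/app"))
```

The resulting string is what you would supply as the user-data of the launch configuration, so every instance the group starts runs the same bootstrap steps.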
If running this yourself is too much of an obstacle, I'd recommend a service like:
Scalarium
RightScale
Scalr (also open source)
They all offer some form of scaling.
HTH