I currently manage ~20 machines located around the world, behind a variety of firewalls, using Ansible in pull mode (all machines pull the playbook from a git+ssh repository). I'd like better reporting on the status of the machines, so I am looking into Ansible Tower.
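(For reference, each machine currently runs something along these lines from cron; the repository URL, schedule, and playbook name below are placeholders, not my real setup:)

    # each managed machine pulls and applies the playbook every 30 minutes
    */30 * * * * ansible-pull -U git@git.example.com:ops/playbooks.git local.yml >> /var/log/ansible-pull.log 2>&1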
As far as I can tell, Tower only supports push mode. The docs for Ansible Tower are not clear on this: can it also manage machines that run in pull mode? That is, can each machine, for example, phone home to Tower to retrieve its configuration and report its results, rather than requiring Tower to push to those machines?
Alternative strategies such as autossh + a reverse tunnel are not options, due to the varying firewalls and IT departments of the remote machines.
Yes, "pull" with Tower is used to initialize machines- for instance, when AWS creates them with autoscaling.
I need to change the labels of worker nodes to a predefined combination of alphanumeric characters (for example: machine01) as and when they join the cluster (and again whenever nodes leave or new nodes join). Is it possible to do this with Ansible, or do we need to set up a cron job? If it is possible with Ansible, I would like a hint on how to run a playbook once and keep it active in the background so it keeps checking for new node labels. Which is computationally cheaper?
How can we run a playbook once and keep it active (i.e., running) in the background to keep checking for new node labels?
Since Ansible is a push-based configuration management tool, it is not designed for such a use case. Ansible simply connects to the remote device via SSH and performs the configuration tasks.
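If you stay with plain Ansible, the usual workaround is exactly the cron job you mention: schedule the playbook from the control node rather than keeping it resident. A sketch, where the inventory and playbook names are placeholders:

    # re-run the labeling playbook every 5 minutes from the Ansible control node
    */5 * * * * ansible-playbook -i /etc/ansible/inventory.ini /etc/ansible/label-nodes.yml >> /var/log/label-nodes.log 2>&1

As long as the playbook is idempotent, re-running it frequently is cheap: nodes that already carry the right labels are left alone.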
Further Documentation
Ansible concepts
Ansible playbooks
We have a clustered Ansible Tower setup comprising 3 servers. I want to run a patching job across all 3 servers, but it only runs on one server (which Tower picks based on an internal algorithm). How do I make it run on all 3 servers in the cluster at the same time?
You're referring to a "Sliced Job" (or a "Distributed Job"):
https://docs.ansible.com/ansible-tower/3.6.2/html/installandreference/glossary.html#term-distributed-job
The Ansible Tower documentation goes over "Job Slicing" in section 17 (for Tower version 3.6.2):
https://docs.ansible.com/ansible-tower/latest/html/userguide/job_slices.html
Please note this warning if you are trying to orchestrate a complex set of tasks across all the hosts:
Any job that intends to orchestrate across hosts (rather than just applying changes to individual hosts) should not be configured as a slice job. Any job that does, may fail, and Tower will not attempt to discover or account for playbooks that fail when run as slice jobs.
Job slicing is useful when you have to re-configure a lot of machines and want to use all of your Tower cluster's capacity. Patching hundreds of machines would be a good example; the only limits are making sure that the source of the patches (the local patch server, or your Internet connection) has sufficient capacity to handle the load, and that any reboots of the systems are coordinated correctly in spite of the job slicing.
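Concretely, the slice count is a property of the job template (exposed in the job template form in the UI, and as job_slice_count in the REST API). A sketch of setting it via the API, where the Tower URL, credentials, and template ID 42 are placeholders:

    # split the patching job template into 3 slices, one per Tower node
    curl -s -X PATCH -u "$TOWER_USER:$TOWER_PASS" \
        -H "Content-Type: application/json" \
        -d '{"job_slice_count": 3}' \
        https://tower.example.com/api/v2/job_templates/42/

Each slice gets roughly a third of the inventory, and the slices can then be placed on different instances in the Tower cluster.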
Suppose that after running a playbook, someone changes the configuration on one or many of the nodes managed by Ansible. How does Ansible come to know that those managed nodes are out of sync, and bring them back to the desired state?
I presume other automation platforms like Chef and Puppet have this, since their remote agents run periodically to stay in sync with the template on the master server.
Also, what are the best practices for doing this?
Ansible doesn't manage anything by itself. It is a tool to automate tasks.
And it is agentless, so there is no way for remote hosts to report state changes of their own accord.
You may want to read about Ansible Tower. Excerpt from features list:
Set up occasional tasks like nightly backups, periodic configuration remediation for compliance, or a full continuous delivery pipeline with just a few clicks.
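With plain Ansible, the closest equivalent is to re-run an idempotent playbook on a schedule (cron, or a Tower schedule). A sketch, where the inventory and playbook names are placeholders:

    # report drift without changing anything
    ansible-playbook -i production.ini site.yml --check --diff

    # re-apply the desired state (remediation)
    ansible-playbook -i production.ini site.yml

Check mode shows what would change, i.e. what has drifted, while a normal run puts the hosts back into the desired state.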
I want to set up 2 droplets at DigitalOcean, and I'm thinking about using Vagrant to handle the configuration.
It looks like a good way to go, since DigitalOcean provides both the box and the "runtime"/provider environment.
I was thinking about having a staging droplet/env where I would use Chef to install tools like nginx, Ruby, etc.
When the Vagrant provisioning/recipes work OK, I would like Vagrant to run the provisioning again, but this time targeting my production droplet/env.
How can I achieve this behavior? Is it possible? Do I need to have multiple folders on my local machine (e.g., ~/vagrant/stage and ~/vagrant/production)?
Thank you.
You may want to revisit your actual deployment use case; I doubt you want to unconditionally provision and deploy both the staging and production droplets at the same time.
If you'd like to provision a DigitalOcean droplet to use as your development environment, there is a provider plugin available for that.
A more common strategy would be to provision your environment locally (using Ansible, Chef, etc.) and then use vagrant push to create an environment-specific deployment, i.e. vagrant push staging provisions and deploys against all hosts marked as staging servers. Inventories within Ansible are one way to describe this separation.
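As a sketch of that separation on the Ansible side (the directory layout and file names here are assumptions), the same playbook is simply pointed at a different inventory per environment:

    # provision/verify staging first
    ansible-playbook -i inventories/staging site.yml

    # once staging looks good, run the identical playbook against production
    ansible-playbook -i inventories/production site.yml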
My company has thousands of server instances running application code - some instances run databases, others are serving web apps, still others run APIs or Hadoop jobs. All servers run Linux.
In this cloud, developers typically want to do one of two things to an instance:
Upgrade the version of the application running on that instance. Typically this involves a) tagging the code in the relevant Subversion repository, b) building an RPM from that tag, and c) installing that RPM on the relevant application server. Note that this operation touches four instances: the SVN server, the build host (where the build occurs), the YUM host (where the RPM is stored), and the instance running the application. (A rough sketch of this flow follows the list.)
Today, a rollout of a new application version might go out to 500 instances.
Run an arbitrary script on the instance. The script can be written in any language, provided the interpreter exists on that instance. E.g., the UI developer wants to run his "check_memory.php" script, which does x, y, and z on the 10 UI instances and then restarts the webserver if certain conditions are met.
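As a rough sketch of the upgrade flow in the first item (hostnames, version numbers, paths, and package names below are all placeholders):

    # 1. tag the code in Subversion
    svn copy https://svn.example.com/myapp/trunk \
             https://svn.example.com/myapp/tags/1.4.2 -m "Tag 1.4.2 for release"

    # 2. build an RPM from that tag on the build host
    ssh build01 "rpmbuild -ba ~/rpmbuild/SPECS/myapp.spec"

    # 3. publish the RPM to the YUM host and refresh the repo metadata
    scp build01:rpmbuild/RPMS/x86_64/myapp-1.4.2-1.x86_64.rpm yum01:/var/www/repo/
    ssh yum01 "createrepo --update /var/www/repo"

    # 4. install the new version on an application instance
    ssh app17 "sudo yum clean expire-cache && sudo yum -y update myapp"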
What tools should I look at to help build this system? I've seen Celery, Resque, and delayed_job, but they seem to be built for churning through a lot of tasks. This system is under much less load: maybe a thousand upgrade jobs on a big day, plus a couple hundred executions of arbitrary scripts. Also, they don't support tasks written in arbitrary languages.
How should the central "job processor" communicate with the instances? SSH, message queues (which one), something else?
Thank you for your help.
NOTE: this cloud is proprietary, so EC2 tools are not an option.
I can think of two approaches:
Set up password-less SSH on the servers, keep a file that contains the list of all machines in the cluster, and run your scripts directly over SSH, for example: ssh user@foo.com "ls -la". This is the same approach used by Hadoop's cluster startup and shutdown scripts. If you want to assign tasks dynamically, you can pick nodes at random. (A sketch of this approach appears below.)
Use something like Torque or Sun Grid Engine to manage your cluster.
The package installation can be wrapped inside a script, so you just need to solve the second problem, and use that solution to solve the first one :)
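A minimal sketch of the first approach (the user name, script name, and hosts.txt are placeholders, and it assumes key-based SSH is already in place):

    #!/usr/bin/env bash
    # run-everywhere.sh -- run the given command on every host listed in hosts.txt
    set -euo pipefail

    while read -r host; do
        # BatchMode makes ssh fail fast instead of prompting for a password
        ssh -o BatchMode=yes "deploy@${host}" "$@" &
    done < hosts.txt

    wait   # block until every background ssh session has finished

Invoked as, say, ./run-everywhere.sh 'php /opt/scripts/check_memory.php', it fans the command out in parallel; for hundreds of hosts you would want to cap the concurrency (xargs -P or GNU parallel) rather than backgrounding everything.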