I am trying to automate bootstrapping a new node, for instance an EC2 instance, into Ansible as a slave node. I have seen some solutions like copying public keys using user-data. Can anyone suggest a more solid approach with an example of how I can achieve this? Thanks in advance.
Ansible has two types of nodes:
Master / Control Machine: the node from which Ansible is invoked
Client / Remote Machines: the nodes on which Ansible operates
Ansible's primary transport is SSH, over which it applies the playbook to the remote machine. For Ansible to SSH into the remote machine, SSH access has to be set up between the two machines beforehand, preferably using public/private key authentication.
When it comes to EC2:
1) Every node has a default key pair with which the instance is launched, and Ansible could use this key to SSH, but this is considered insecure / not a best practice.
2) A key has to be present on the client node with which Ansible can authenticate successfully, and the preferred way is to pull the key at boot via user-data from a restricted S3 bucket, as sketched below.
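A minimal sketch of option 2, assuming a bucket name, object key, default user, and an instance profile that allows s3:GetObject (all placeholders), and that the AWS CLI is present in the image:

#!/bin/bash
# User-data sketch: pull the Ansible control machine's public key from a
# restricted S3 bucket and append it to the default user's authorized_keys.
# Assumed names: bucket "my-restricted-bucket", object "keys/ansible.pub", user "ubuntu".
set -euo pipefail
aws s3 cp s3://my-restricted-bucket/keys/ansible.pub /tmp/ansible.pub
install -d -m 700 -o ubuntu -g ubuntu /home/ubuntu/.ssh
cat /tmp/ansible.pub >> /home/ubuntu/.ssh/authorized_keys
chown ubuntu:ubuntu /home/ubuntu/.ssh/authorized_keys
chmod 600 /home/ubuntu/.ssh/authorized_keys
rm -f /tmp/ansible.pub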
Related
I have the following configuration in Alibaba ECS:
Public Connector and Three Test Nodes
Connector has network connections on the public internet and the default VSwitch in the default VPC. Connector was created using the ECS web interface. The testnode[0-2] machines were created in a script using the Alibaba cli command: aliyun.
When the instances start running, the Connector cannot ping any of them. If I set a password on a test node and then restart it, ping starts working. The script uses a snapshot of the Connector as the image for the test nodes. The Connector has a randomly generated, long, and forgotten root password. Root access is via SSH with a passphrase-protected key pair. It also has the same for a non-root user used for the test code.
What I have tried is creating test nodes with the following CreateInstance options:
No --Password and no --InheritPassword options (original intent: why set a password? I have the access I need from the Connector image)
--InheritPassword option (I need a root password in order for the private network interfaces to work, the root password in the Connector image is fine)
--Password option (I need to explicitly set a root password on the test nodes)
The result is the same in all cases: until I use the ECS web interface to set a password and restart a test node, the Connector cannot ping it.
What I know:
This is not a problem with the default security group, VPC, or VSwitch, as I do not touch any settings on these entities to get ping working.
This is not a problem with the instance image because as soon as ping works, ssh to the test nodes works as well.
What am I doing wrong, or what am I missing? The whole purpose is to spin up instances without having to type away at the ECS web interface. I figured out what it took to get the private network traffic moving because I wanted to debug the situation on the test nodes, and for that I had to set a root password and gain access from the ECS web console, which, again, defeats the purpose of scripting.
Aliyun command for creating the test nodes:
aliyun ecs CreateInstance --ImageId m-2vchb2oxldfuloh51wp9 --RegionId=cn-chengdu --InstanceType=ecs.c6.xlarge --SpotStrategy SpotWithPriceLimit --SpotPriceLimit 0.25 --ZoneId cn-chengdu-a --InternetChargeType PayByTraffic --InternetMaxBandwidthOut 99 --InstanceName TEST_NODE-0 --HostName testnode0 --Password 'notgoingtotellyou'
The operating system for all instances is Ubuntu 18.04.
Aliyun command version is 3.0.30.
I got two answers. One from a co-worker. One from Alibaba.
Co-worker's answer:
The configuration fails because the Ubuntu 18.04 image that I created for the non-public test machines used a static address for the internal network interface. I changed the internal network interface (eth0) to use DHCP and everything worked. See the netplan configuration sketch below for how to change the IP address assignment.
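As a hedged illustration (the file name 01-netcfg.yaml and the interface name eth0 are assumptions; adjust them to your image), the change amounts to something like:

# Switch eth0 from a static address to DHCP via netplan, then apply it.
cat <<'EOF' | sudo tee /etc/netplan/01-netcfg.yaml
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: true
EOF
sudo netplan apply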
Alibaba's answer:
Try using aliyun ecs RunInstances instead of three individual aliyun ecs CreateInstance and aliyun ecs StartInstance invocations. I did not try this solution as it would have involved rewriting my scripts. Alibaba could have done more to motivate me by providing an explanation as to why RunInstances would produce a different result than the combination of CreateInstance and StartInstance.
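For reference, an untested sketch of what that single call might look like, reusing the values from the question (instance naming via --UniqueSuffix is an assumption and may need adjusting):

# Untested: create and start all three test nodes in one call instead of
# three CreateInstance + StartInstance pairs.
aliyun ecs RunInstances \
  --RegionId cn-chengdu \
  --ZoneId cn-chengdu-a \
  --ImageId m-2vchb2oxldfuloh51wp9 \
  --InstanceType ecs.c6.xlarge \
  --SpotStrategy SpotWithPriceLimit \
  --SpotPriceLimit 0.25 \
  --InternetChargeType PayByTraffic \
  --InternetMaxBandwidthOut 99 \
  --Amount 3 \
  --InstanceName TEST_NODE \
  --HostName testnode \
  --UniqueSuffix true \
  --Password 'notgoingtotellyou'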
I am creating a Cloudera cluster, and when I SSH from one node to another I get a message saying "Public Key". Login to the machines happens using a PEM file and passphrase. Is it necessary to be able to SSH from one node to another in order to do the Cloudera cluster installation?
Yes, it is necessary. Cloudera Manager needs a way to access the other machines over SSH. This can be authenticated using public keys (recommended) or passwords.
Cloudera Manager requires an SSH key to communicate with the Cloudera SCM Agents and do the installation.
Cloudera Manager installs the cluster for you, but you will initially need to add the manager's key to the authorized_keys SSH file so it can access the agent machines.
How does Cloudera Manager Work? (doesn't mention SSH, though)
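A minimal sketch of that initial key distribution, run on the Cloudera Manager host (the agent host names are placeholders):

# Generate a key pair if one does not exist yet, then copy the public key
# into authorized_keys on each agent machine (ssh-copy-id authenticates with
# your existing credentials, e.g. the PEM key or a password).
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ''
for host in agent1.example.com agent2.example.com agent3.example.com; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub "$host"
done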
I would like to add nodes to a certain vault before creating them, for example:
All Vagrant machines that I provision with vagrant up whose names match the pattern vagrant-dev-* should be able to access the Chef vault secrets.
If I try to do this, I get a warning that no machine with that pattern exists in Chef:
WARNING: No clients were returned from search, you may not have got what you expected!!
If I try the command after the machine is provisioned, it works, but then provisioning fails because the machine did not have access to the vault to configure the sensitive information.
knife vault create secrets root -M client -S "name:vagrant-dev-*"
How can I make the machines have access to the vault before provisioning them?
Unfortunately this is not possible. For something to be added to a vault it needs to have an RSA public key available on the Chef Server. This is generally done as part of the node bootstrap and client creation. This is a structural limitation of this whole category of asymmetric pre-encryption systems, the keys for all secrets consumers must be known at the time of the pre-encryption process.
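The usual ordering is therefore the reverse of what you want. A sketch with the vault and item names from the question and a hypothetical hostname:

# 1. Bootstrap the machine first, so a client and its public key exist on the Chef Server.
knife bootstrap vagrant-dev-01.example.com -N vagrant-dev-01
# 2. Re-run the vault item's stored search so the new client is granted access
#    (or add it explicitly with: knife vault update secrets root -S "name:vagrant-dev-*").
knife vault refresh secrets root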
My application consists of two components: server and agent.
I am using Vagrant to create two EC2 instances. One for each application component.
First, I create the server instance and provision it with Chef.
Now, I am trying to provision the agent instance. To do so, I need to download the agent deployment package from the server instance to the agent instance.
The best way is to use scp with the server instance's private IP.
How can I share the server's private IP address with the agent provisioning process (Chef)?
It seems the vagrant-hostmanager plugin (https://github.com/smdahlen/vagrant-hostmanager) could help.
From the documentation:
You can customize the way host manager resolves the IP address for each machine. This might be handy in the case of the AWS provider, where the host name is stored in the ssh_info hash of each machine, which causes generation of an invalid /etc/hosts file.
A custom IP resolver gives you the opportunity to calculate the IP address for each machine yourself, also giving you access to the machine that is updating /etc/hosts. For example:
# Resolve each machine's IP by running `host` on the hostname Vagrant reports
# in ssh_info and extracting the first IPv4 address from the output.
config.hostmanager.ip_resolver = proc do |vm, resolving_vm|
  if hostname = (vm.ssh_info && vm.ssh_info[:host])
    `host #{hostname}`.split("\n").last[/(\d+\.\d+\.\d+\.\d+)/, 1]
  end
end
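As a follow-up sketch (the machine name "server", the user, and the paths are assumptions tied to your Vagrantfile): once hostmanager has written the resolved IPs into /etc/hosts, the agent's provisioning can fetch the deployment package from the server by hostname instead of a hard-coded private IP.

# Run on the agent during provisioning; "server" resolves through the
# /etc/hosts entries maintained by vagrant-hostmanager.
scp -i /home/ubuntu/.ssh/deploy_key ubuntu@server:/opt/myapp/agent-package.tar.gz /tmp/agent-package.tar.gz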
My goal is to launch 200 instances of a Windows node from the same AMI in AWS. These nodes come up and connect to my head node. Now, every launch of a new node creates a new password for that node. This is hard to manage, especially if I want to do group remote maintenance.
I was thinking maybe I can make all instances of a specific AMI have the same password, but how do I do that? Should I modify the sysprep config file C:\Program Files\Amazon\Ec2ConfigService\sysprep2008.xml, or should I disable the set-password option in the EC2Config tool and then create an AMI?
If it is the config file, what exactly should I put in sysprep2008.xml?