Knife ec2: need to avoid re-bootstrapping of server after hostname change - amazon-ec2

I might be doing something wrong, but here is the situation. Standalone Chef server 12.3.0. CentOS 6.3 running on AWS.
During knife bootstrap I apply the hostname::default recipe, along with some other recipes, to change the server's FQDN. Everything seems to be fine. The Chef server shows the newly bootstrapped instance, but the Node Name column still shows the old FQDN, something like ip-x-x-x-x.aws-region-name.compute.internal.
When I then ssh to this host and run chef-client, I get the following error:
[ec2-user@newHostName ~]$ sudo chef-client
Starting Chef Client, version 12.3.0
Chef encountered an error attempting to load the node data for "newHostName"
Authentication Error:
----------------
Failed to authenticate to the chef server (http 401).
Server Response:
----------------
Failed to authenticate as 'newHostName'. Ensure that your node_name and client key are correct.
Relevant Config Settings:
-------------------------
chef_server_url "https://chefServerDomain/organizations/organizationName"
node_name "newHostName"
client_key "/etc/chef/client.pem"
If these settings are correct, your client_key may be invalid, or
you may have a chef user with the same client name as this node.
[2015-05-04T12:36:03-07:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
Chef Client failed. 0 resources updated in 0.962848623 seconds
[2015-05-04T12:36:03-07:00] ERROR: 401 "Unauthorized"
[2015-05-04T12:36:03-07:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
I have checked closed issue #8 on GitHub, according to which I need to manually edit the client.rb file and include the node_name parameter. At the same time, the Chef client.rb documentation indicates that I should not do that:
node_name is used to determine which configuration should be applied
and to set the client_name (which is the name used when
authenticating to a Chef server). The default value is set
automatically to be the FQDN of the chef-client, as detected by
Ohai. In general, leaving this setting blank and letting Ohai
assign the FQDN of the node as the node_name during each chef-client
run is the recommended approach.
After cleaning up the /etc/chef/* folder, removing this instance from the Chef server and re-bootstrapping the EC2 instance, I was able to make it work. The FQDN was displayed correctly on the Chef server under the Node Name column as newServerName.
Could you please advise what I should do to avoid double bootstrapping?

Pass the node name you want the node to use to the bootstrap command with "-N hostname". Then it will register properly with the final node name.
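As a rough sketch of that suggestion for the Chef 12 setup in the question: the wrapper name bootstrap_with_name, the ec2-user login and the default run list are assumptions for illustration, while the knife options are the Chef 12 long forms (--node-name is the long form of -N). Newer knife releases rename some of the SSH options, so check knife bootstrap --help on your version.

```shell
# Bootstrap an instance and register it under its final name in one pass.
# Usage: bootstrap_with_name <address> <node_name> <ssh_key> [run_list]
# (bootstrap_with_name is a hypothetical helper, not part of knife.)
bootstrap_with_name() {
  local address="$1"
  local node_name="$2"
  local ssh_key="$3"
  local run_list="${4:-recipe[hostname::default]}"

  # --node-name sets the name the client registers and authenticates with,
  # so it no longer depends on the FQDN Ohai detects before the rename.
  knife bootstrap "$address" \
    --ssh-user ec2-user \
    --identity-file "$ssh_key" \
    --sudo \
    --node-name "$node_name" \
    --run-list "$run_list"
}
```

For example: bootstrap_with_name ip-x-x-x-x.aws-region-name.compute.internal newHostName ~/.ssh/mykey.pem. knife ec2 server create accepts -N in the same way.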

Related

Unable to bootstrap node on windows node using chef and knife

I've exhausted my options in trying to bootstrap a Windows node running in Azure. I have the workstation connected to my self-hosted Chef server without any issues. I run the bootstrap command and get the following:
Creating new client for vm1
Creating new node for vm1
Connecting to 104.***.***.***
ERROR: Net::SSH::ConnectionTimeout: Net::SSH::ConnectionTimeout
I know the username and password are valid as well as the IP of the target node. What are my options here for debugging such a problem? I believe the necessary ports are open, unless I'm missing something special. I have telnet enabled. Does anyone have any better ideas?
To copy down from the comments, to bootstrap over WinRM you need the knife bootstrap windows winrm command.
You can also bootstrap a Windows machine with the following command. Core Chef now supports bootstrapping Windows systems without a knife plugin:
sudo knife bootstrap -o winrm <public_IPV4_Address/DNS_of_client_machine> -U Administrator -P '<pwd>' --node-name <node_name> --run-list 'recipe[<cookbook_name>]'
where,
public_IPV4_Address/DNS_of_client_machine --> Public IP address or DNS name of the client machine.
node_name --> String representing the node name.
cookbook_name --> Cookbook that we want to execute on the client machine.
pwd --> Password used to connect to the Windows client machine.
Note: Make sure to execute the above command from the ~/chef-repo/.chef/ directory.
If you are unable to execute the above command with the -o winrm option, install the gem packages below:
chef gem install winrm
chef gem install knife-windows
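The Net::SSH::ConnectionTimeout in the question means knife was trying SSH (port 22), not WinRM. Before retrying, it may help to confirm which ports the node actually accepts connections on. A minimal sketch, assuming nc (netcat) is installed on the workstation and that WinRM is listening on its default ports (5985 for HTTP, 5986 for HTTPS); check_port and check_bootstrap_ports are hypothetical helper names.

```shell
# Report whether a TCP port on the target accepts connections.
# Assumes an nc build that supports -z (scan only) and -w (timeout in seconds).
check_port() {
  local host="$1"
  local port="$2"
  if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
    echo "$host:$port open"
  else
    echo "$host:$port closed or filtered"
    return 1
  fi
}

# 22 = SSH, 5985 = WinRM over HTTP, 5986 = WinRM over HTTPS (default ports).
check_bootstrap_ports() {
  local host="$1"
  local port
  for port in 22 5985 5986; do
    # Keep going after a closed port so all three results are printed.
    check_port "$host" "$port" || true
  done
}
```

Run it as check_bootstrap_ports followed by the node's IP address. If only 5985 or 5986 is open, the node is reachable over WinRM and the bootstrap has to use the WinRM protocol.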

Setting up a three tier environment in puppet

These are my files:
Nodes.pp file
site.pp file
I need to set up the infrastructure in the diagram, and I would like to use Puppet Automation in order to do so. I would need to:
Create 4 VMs: 1 for the DB, 1 web server, 1 load balancer, 1 master
Set them up with Puppet Agent
Find the appropriate modules/cookbooks from the community site (Puppet Forge/Chef Supermarket)
Configure the nodes using recipes/classes fetched from the community sites.
Provide configuration parameters in order to have all these nodes connect to each other.

The end goal is to have a working WordPress setup.
I got stuck with the master/agent configuration process. I have a Puppet master and 3 agents up and running, but whenever I run puppet agent --test on an agent, it throws an error. I look forward to the community's help.
The error I am getting is...
[root@agent1 vagrant]# puppet agent --noop --test
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
First, take a look at the Puppet master logs.
Second, the error message is too short. Something is missing after
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could
The text after the "Could" can be helpful ;)

How can I get a Vagrant provision shell script to successfully clone a mercurial repository?

I am trying to configure a virtual machine using Vagrant as a local development environment.
I would like the provision script to clone a Mercurial repository from Bitbucket within the guest machine.
In order to accomplish this I have added hg clone ssh://hg@bitbucket.org/myuser/myrepo in my provision shell script
I also added the flag config.ssh.forward_agent = true in my Vagrantfile.
However I keep getting the following error:
==> default: remote: Host key verification failed.
==> default: abort: no suitable response from remote hg!
The SSH command responded with a non-zero exit status. Vagrant assumes that
this means the command failed. The output for this command should be
in the log above. Please read the output to determine what went wrong.
I've successfully set up SSH keys for my Bitbucket account on my host machine but for some reason they don't seem to be forwarded in the guest machine even though I've set the flag true.
My host machine is running Windows 7 and the guest is running Linux (Trusty Tahr).
Sincerely appreciate any tips on how I can accomplish this.
You need to add the host key of bitbucket to your known_hosts file. You could either do that manually, or try something like https://serverfault.com/questions/132970/can-i-automatically-add-a-new-host-to-known-hosts.
Hope that helps.
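One way to do that from the provision script, before the hg clone line, is sketched below. It assumes ssh-keyscan is available in the guest; add_known_host is a hypothetical helper name. Note that ssh-keyscan trusts whatever key the network returns, so compare the fingerprint with the one Bitbucket publishes if that matters for you. Also keep in mind that Vagrant's shell provisioner runs as root by default, so the default path here is root's known_hosts.

```shell
# Append a host's public SSH keys to a known_hosts file, but only once.
# Usage: add_known_host <hostname> [known_hosts_path]
add_known_host() {
  local host="$1"
  local known_hosts="${2:-$HOME/.ssh/known_hosts}"

  mkdir -p "$(dirname "$known_hosts")"
  touch "$known_hosts"

  # Skip the scan if a line for this host is already present.
  if grep -q "^${host}[ ,]" "$known_hosts"; then
    return 0
  fi

  # ssh-keyscan prints the host's public keys in known_hosts format.
  ssh-keyscan "$host" >> "$known_hosts" 2>/dev/null
}
```

In the provision script this would be called as add_known_host bitbucket.org, followed by the existing hg clone command.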

Hadoop Ambari cannot confirm hosts

I tried to use Ambari to manage the installation and maintenance of the Hadoop cluster.
After I started the Ambari server, I used the web page to set up the Hadoop cluster.
But at the 3rd step, Confirm Hosts, it fails with an error.
When I checked the log at /var/log/ambari-server, I found:
INFO:root:BootStrapping hosts ['qiao'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: redhat6 with user 'root' sshKey File /var/run/ambari-server/bootstrap/1/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/1 ambari: master; server_port: 8080; ambari version: 1.4.1.25
INFO:root:Executing parallel bootstrap
ERROR:root:ERROR: Bootstrap of host qiao fails because previous action finished with non-zero exit code (1)
INFO:root:Finished parallel bootstrap
Did you provide the SSH RSA private key as a file, or paste it?
Also, from the machine you are installing from, make sure you can ssh to every host without typing a password.
If you still get the same error, try
ambari-server reset
ambari-server setup
Then restart ambari-server
ambari-server restart
and then try accessing Ambari again.
It should work.
Make sure you can ssh to every single host on the list, including all master hosts.
To do this, ensure that the Ambari host's .ssh/id_rsa.pub entry is included in every host's .ssh/authorized_keys file. Then ssh from Ambari's host to every single server and check whether it asks for your password. You can use a tutorial like http://www.tecmint.com/ssh-passwordless-login-using-ssh-keygen-in-5-easy-steps/ to check if everything has been done properly.
You need to do the same on the Ambari host itself, if you added it to hosts list.
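The "ssh to every host and see if it asks for a password" check can be scripted from the Ambari host. A minimal sketch, assuming the root user shown in the log above; check_passwordless_ssh is a hypothetical helper name. BatchMode=yes makes ssh fail instead of prompting, so a host that would ask for a password shows up as FAIL.

```shell
# Print, for each host, whether key-based login works without a prompt.
# Usage: check_passwordless_ssh <user> <host>...
# Returns non-zero if any host failed.
check_passwordless_ssh() {
  local user="$1"
  local host
  local failed=0
  shift
  for host in "$@"; do
    # BatchMode=yes disables password prompts; ConnectTimeout avoids long hangs.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true >/dev/null 2>&1; then
      echo "ok   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

For the cluster in the question this would be check_passwordless_ssh root qiao, with any further hosts from the list appended.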

Install Chef-Server 11 on EC2 Instance

I have been using hosted Chef for quite some time and wanted to explore the open source Chef server, so I am trying to set up Chef-Server 11 on an EC2 instance.
I have chef-server running and can access its web GUI. I have the chef-workstation configured on another EC2 instance, and that is also working fine.
Problem: I am not able to upload any cookbook.
I get the error below when I try to upload the cookbook:
# knife cookbook upload getting-started
Uploading getting-started [0.4.0]
/opt/chef/embedded/lib/ruby/1.9.1/net/http.rb:763:in `initialize': Connection refused - connect(2) (Errno::ECONNREFUSED)
However, knife's list commands are working fine.
I did my homework and came across the links below:
http://www.opscode.com/blog/2013/03/11/chef-11-server-up-and-running/
http://www.curvve.com/blog/servers/2013/script-to-configure-and-set-your-hostname-and-fqdn-on-ec2-instances/
So,
It is mentioned that the chef-server needs a working FQDN. I set my public EC2 host name as the hostname of the server and also set it up in /etc/hosts, rebooted the instance, and ran chef-server-ctl reconfigure again. I am still facing the same error.
QUESTION: How do I figure out the FQDN part of the EC2 instance for chef-server to work? If anyone has set up chef-server successfully on EC2 and was able to upload cookbooks, please share your steps for the FQDN setup.
I was having a hard time with this but this solution worked!
Edit /etc/chef-server/chef-server.rb and add these lines (create the file if it doesn't exist):
server_name = "THE PUBLIC IP OF YOUR INSTANCE"
api_fqdn server_name
nginx['url'] = "https://#{server_name}"
nginx['server_name'] = server_name
lb['fqdn'] = server_name
bookshelf['vip'] = server_name
I found the solution here
http://sahebjade.blogspot.com/2013/05/check-your-knife-configuration-and.html
This is how I got it working: I updated the public DNS name of my EC2 instance (chef-server) in /etc/sysconfig/network and ran service network restart. Now I am able to upload the cookbooks fine.
Need to think about elastic IP as potential option for my chef-server.
Edit /etc/chef-server/chef-server.rb and add these lines (create the file if it doesn't exist):
bookshelf["vip"] = node["ipaddress"]
bookshelf["url"] = "https://#{node["ipaddress"]}"
erchef['s3_url_ttl'] = 3600
The first two lines will point your chef-server URL to the machine's IP, and the third will solve a timeout issue that apparently always exists when the Chef Server is on EC2.
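Edits to /etc/chef-server/chef-server.rb like the ones above only take effect after chef-server-ctl reconfigure, which the question already mentions running. A small sketch of applying the change and checking that the services came back; apply_chef_server_config is a hypothetical helper name, and it assumes you run it as root on the Chef server.

```shell
# Regenerate the Chef server's service configs from chef-server.rb,
# then list the service states.
# Usage: apply_chef_server_config [path_to_chef-server.rb]
apply_chef_server_config() {
  local config="${1:-/etc/chef-server/chef-server.rb}"

  if [ ! -f "$config" ]; then
    echo "missing config file: $config" >&2
    return 1
  fi

  # status only runs if reconfigure succeeded.
  chef-server-ctl reconfigure && chef-server-ctl status
}
```

After that, retry knife cookbook upload from the workstation.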
I wanted to expand some on the answers since they don't give a complete picture. This applies to Chef 11 (hopefully Chef 12 is smarter)
In my case I rolled a master up under VPC #1 which gave it an internal address like this
ip-10-0-0-10.ec2.internal
Because I was only playing with the VPC initially, I had misconfigured some things I needed so I had to drop it and I created a new scheme. Thankfully, I was able to snapshot the old Chef master and bring it up under the new VPC but I found that I couldn't log into Chef anymore. It took some digging but I found in my /var/log/chef-server/chef-server-webui/current log that the install had glommed onto the old hostname and set that as the internal URL for... everything. This caused problems after the internal hostname change
2014-12-24_16:19:09.46680 SocketError: Error connecting to https://ip-10-0-0-10.ec2.internal/users/admin - getaddrinfo: Name or service not known
Now, to the OP's answer:
Need to think about elastic IP as potential option for my chef-server
In my case, I just added a CNAME to CloudFlare and set that as my permanent address. Since I can set CloudFlare to a low TTL on that one address it makes it easy to move it around between IP changes (I don't need an Elastic IP while I'm just getting it configured). This way I could then tell Chef to always look for the same URL and not worry about an EIP.
Once that was done, I had to update Chef. I don't know what changed (this is 11.16.4) but I found the configs live in /var/opt/chef-server/chef-server-webui/etc/chefserver.rb as opposed to some of the other answers listing chef-server.rb. Not sure if that's a YMMV thing or not.
I changed the following towards the bottom of that file
# Environment specific application configuration.
# These values override the ones set in 'RAILS_ROOT/config/application.rb'
#config.chef_server_url = "https://ip-10-0-0-10.ec2.internal"
config.chef_server_url = "https://chef.mydomain.com"
I also changed /var/opt/chef-server/nginx/etc/chef_https_lb.conf
server_name chef.mydomain.com;
Finally I restarted Chef
chef-server-ctl restart
That seems to have done the trick. Logins work again.

Resources