HDP host registration failure - hortonworks-data-platform

On the 'Select Stack' page, under 'Advanced Repository Options', I checked only 'redhat6', which showed '400: Bad Request' for both HDP and HDP-UTILS. I then checked 'Skip Repository Base URL validation' and proceeded.
Next, I added the hostnames and the id_rsa file (of the host where Ambari is running, which will also serve as the NameNode) and clicked Next.
Three hosts (non-Ambari) failed earlier than the other one; the following is the log for one of them:
==========================
Creating target directory…
==========================
Command start time 2015-02-11 16:03:55
Connection to l1033lab.sss.se.scania.com closed.
SSH command execution finished
host=l1033lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:56
==========================
Copying common functions script…
==========================
Command start time 2015-02-11 16:03:56
scp /usr/lib/python2.6/site-packages/ambari_commons
host=l1033lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:56
==========================
Copying OS type check script…
==========================
Command start time 2015-02-11 16:03:56
scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=l1033lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:56
==========================
Running OS type check…
==========================
Command start time 2015-02-11 16:03:56
Cluster primary/cluster OS type is redhat6 and local/current OS type is redhat6
Connection to l1033lab.sss.se.scania.com closed.
SSH command execution finished
host=l1033lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:57
==========================
Checking 'sudo' package on remote host…
==========================
Command start time 2015-02-11 16:03:57
sudo-1.8.6p3-12.el6.x86_64
Connection to l1033lab.sss.se.scania.com closed.
SSH command execution finished
host=l1033lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:58
==========================
Copying repo file to 'tmp' folder…
==========================
Command start time 2015-02-11 16:03:58
scp /etc/yum.repos.d/ambari.repo
host=l1033lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:58
==========================
Moving file to repo dir…
==========================
Command start time 2015-02-11 16:03:58
Connection to l1033lab.sss.se.scania.com closed.
SSH command execution finished
host=l1033lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:58
==========================
Copying setup script file…
==========================
Command start time 2015-02-11 16:03:58
scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=l1033lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:59
==========================
Running setup agent script…
==========================
Command start time 2015-02-11 16:03:59
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml: [Errno 12] Timeout on http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml: (28, 'connect() timed out!')
Trying other mirror.
Error: Cannot retrieve repository metadata (repomd.xml) for repository: Updates-ambari-1.7.0. Please verify its path and try again
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml: [Errno 12] Timeout on http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml: (28, 'connect() timed out!')
Trying other mirror.
Error: Cannot retrieve repository metadata (repomd.xml) for repository: Updates-ambari-1.7.0. Please verify its path and try again
/bin/sh: /usr/sbin/ambari-agent: No such file or directory
{'exitstatus': 1, 'log': ('', None)}
Connection to l1033lab.sss.se.scania.com closed.
SSH command execution finished
host=l1033lab.sss.se.scania.com, exitcode=1
Command end time 2015-02-11 16:05:00
ERROR: Bootstrap of host l1033lab.sss.se.scania.com fails because previous action finished with non-zero exit code (1)
ERROR MESSAGE: tcgetattr: Invalid argument
Connection to l1033lab.sss.se.scania.com closed.
STDOUT: This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml: [Errno 12] Timeout on http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml: (28, 'connect() timed out!')
Trying other mirror.
Error: Cannot retrieve repository metadata (repomd.xml) for repository: Updates-ambari-1.7.0. Please verify its path and try again
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml: [Errno 12] Timeout on http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml: (28, 'connect() timed out!')
Trying other mirror.
Error: Cannot retrieve repository metadata (repomd.xml) for repository: Updates-ambari-1.7.0. Please verify its path and try again
/bin/sh: /usr/sbin/ambari-agent: No such file or directory
{'exitstatus': 1, 'log': ('', None)}
Connection to l1033lab.sss.se.scania.com closed.
The last one to fail (the host where Ambari runs) had the following log:
==========================
Creating target directory…
==========================
Command start time 2015-02-11 16:03:55
Connection to l1032lab.sss.se.scania.com closed.
SSH command execution finished
host=l1032lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:56
==========================
Copying common functions script…
==========================
Command start time 2015-02-11 16:03:56
scp /usr/lib/python2.6/site-packages/ambari_commons
host=l1032lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:56
==========================
Copying OS type check script…
==========================
Command start time 2015-02-11 16:03:56
scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=l1032lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:56
==========================
Running OS type check…
==========================
Command start time 2015-02-11 16:03:56
Cluster primary/cluster OS type is redhat6 and local/current OS type is redhat6
Connection to l1032lab.sss.se.scania.com closed.
SSH command execution finished
host=l1032lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:57
==========================
Checking 'sudo' package on remote host…
==========================
Command start time 2015-02-11 16:03:57
sudo-1.8.6p3-12.el6.x86_64
Connection to l1032lab.sss.se.scania.com closed.
SSH command execution finished
host=l1032lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:58
==========================
Copying repo file to 'tmp' folder…
==========================
Command start time 2015-02-11 16:03:58
scp /etc/yum.repos.d/ambari.repo
host=l1032lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:58
==========================
Moving file to repo dir…
==========================
Command start time 2015-02-11 16:03:58
Connection to l1032lab.sss.se.scania.com closed.
SSH command execution finished
host=l1032lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:58
==========================
Copying setup script file…
==========================
Command start time 2015-02-11 16:03:58
scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=l1032lab.sss.se.scania.com, exitcode=0
Command end time 2015-02-11 16:03:59
==========================
Running setup agent script…
==========================
Command start time 2015-02-11 16:03:59
Automatic Agent registration timed out (timeout = 300 seconds). Check your network connectivity and retry registration, or use manual agent registration.
The machines have Internet access, so I presumed there was no need to configure local repositories. Are there any mandatory steps before installing Ambari and proceeding?
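A quick way to check whether each host can actually reach the public repository (the repomd.xml timeouts in the log point to a connectivity or proxy problem) is to fetch the repo metadata directly; a minimal sketch, using the same Ambari 1.7.0 base URL shown in the log:

# Run on each host being registered; a timeout here means yum will time out too.
curl -v --connect-timeout 30 \
  http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.7.0/repodata/repomd.xml
# If the hosts go out through an HTTP proxy, yum needs it configured as well, e.g.:
#   echo 'proxy=http://proxy.example.com:8080' >> /etc/yum.conf   # hypothetical proxy address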

After spending plenty of time on this, I assumed that, despite the Internet connectivity, local repositories would be needed. I installed an Apache server and made my repositories accessible as per the documentation. Then, in 'Advanced Repository Options', I replaced the web URL with the local repository URL, and the hosts registered.
I'm still not sure why local repositories are needed (even the documentation says they are only required in case of limited or no Internet connectivity). The repomd.xml timeouts in the bootstrap log suggest the hosts could not actually reach the public Hortonworks repository (for example because of a proxy or firewall), which would explain why pointing Ambari at a local repository fixed registration.
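For completeness, a rough sketch of serving a local repository over Apache, assuming the Ambari/HDP RPMs have already been downloaded; the paths and version below are illustrative, not from the original post:

# On the host that will serve the repository (RHEL/CentOS 6):
yum install -y httpd createrepo
service httpd start

# Place the downloaded RPMs under the web root and generate repo metadata:
mkdir -p /var/www/html/repo/ambari-1.7.0
cp /path/to/downloaded/rpms/*.rpm /var/www/html/repo/ambari-1.7.0/   # illustrative path
createrepo /var/www/html/repo/ambari-1.7.0

# The resulting base URL, e.g. http://<repo-host>/repo/ambari-1.7.0,
# is what goes into 'Advanced Repository Options' in the Ambari wizard.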

Related

Jenkins bash scripts in temp folder not executing

I have a simple script file:
#!/bin/bash
name1=$1
name2=$2
echo " $name1 $name2"
When I run it in PuTTY, connected to the CentOS system inside the Jenkins container, it produces output.
But when I run it through a Jenkins build, it fails saying the file is not found.
Started by user Jenkins Admin
Running as SYSTEM
Building in workspace /var/jenkins_home/workspace/my_first_job
[my_first_job] $ /bin/sh -xe /tmp/jenkins5284289512384758361.sh
+ Name=Madan
+ /tmp/1.sh Madan
/tmp/jenkins5284289512384758361.sh: 3: /tmp/jenkins5284289512384758361.sh: /tmp/1.sh: not found
Build step 'Execute shell' marked build as failure
Finished: FAILURE
This is the error.
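As a diagnostic sketch (an assumption, not part of the original post): the "not found" message means sh on the build node cannot find /tmp/1.sh, so one common arrangement is to keep the script in the job's workspace and call it from there; the path scripts/1.sh below is hypothetical:

# Jenkins "Execute shell" build step contents:
chmod +x "$WORKSPACE/scripts/1.sh"     # hypothetical location inside the checked-out workspace
"$WORKSPACE/scripts/1.sh" Madan Kumar  # the two positional arguments echoed by the script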

Error while Provisioning oracle through Puppet

I am trying to provision Oracle through Puppet. It fails while running the root.sh script.
The reason, as far as I can see, is:
/oracle/product/12.1/db/root.sh: line 13: /oracle/product/12.1/db/rdbms/install/rootadd_rdbms.sh: No such file or directory
Error: /Stage[main]/Install_oradb/Oradb::Installdb[12.1.0.1_Linux-x86-64]/Exec[run root.sh script 12.1.0.1_Linux-x86-64]/returns: change from notrun to 0 failed: Check /oracle/product/12.1/db/install/root_itest-525400f545bb_2016-07-28_11-32-49.log for the output of root script.
Skipping everything else then.
Here are the files used for provisioning:
linuxamd64_12c_database_1of2.zip
linuxamd64_12c_database_2of2.zip

Is there a CLI command to report if vagrant provisioning is complete?

While vagrant up is executing, any call to vagrant status will report that the machine is 'running', even if the provisioning is not yet complete.
Is there a simple command for asking whether the vagrant up call is done and the machine is fully-provisioned?
You could have your provision script write to a networked file and query that. Or you could run vagrant ssh -c /check/for/something if there is a file or service to check against. Your provision script could also ping out to a listener you set up.
You could also use the Vagrant log or debug output to check when provisioning is done.
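For example, a minimal sketch of the marker-file approach, assuming the provision script touches /var/run/provision-complete as its last step (the path is illustrative):

# Last line of the provision script, inside the guest:
touch /var/run/provision-complete

# From the host, poll until the marker exists (vagrant ssh -c propagates the remote exit code):
until vagrant ssh -c 'test -f /var/run/provision-complete' >/dev/null 2>&1; do
  sleep 5
done
echo "Provisioning finished"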

How to execute sh file from jenkins running on windows

Cygwin is installed on my Windows machine, and I can execute the sh file from the command prompt.
The same Cygwin plugin has been installed in Jenkins, which is also running on Windows.
I created a job in Jenkins with a Build step -> Execute shell command, where I give the command sh /cygdrive/d/539707/data/getchanges/gymBuild.sh. While executing the job I get the exception below.
NOTE 1: In the Jenkins configuration, under Shell, I didn't specify any path.
[workspace] $ sh -xe D:\539707\tomcat-7.0.12\temp\hudson4624102689815543789.sh
FATAL: command execution failed
java.io.IOException: Cannot run program "sh" (in directory "C:\Users\539707.jenkins\jobs\Test_Gym\workspace"): CreateProcess error=2, The system cannot find the file specified
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at hudson.Proc$LocalProc.<init>(Proc.java:244)
    at hudson.Proc$LocalProc.<init>(Proc.java:216)
    at hudson.Launcher$LocalLauncher.launch(Launcher.java:815)
    at hudson.plugins.cygpath.CygpathLauncherDecorator$1.launch(CygpathLauncherDecorator.java:66)
    at hudson.Launcher$ProcStarter.start(Launcher.java:381)
    at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:95)
    at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:64)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:782)
    at hudson.model.Build$BuildExecution.build(Build.java:205)
    at hudson.model.Build$BuildExecution.doRun(Build.java:162)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
    at hudson.model.Run.execute(Run.java:1738)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:98)
    at hudson.model.Executor.run(Executor.java:410)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
    at java.lang.ProcessImpl.create(Native Method)
    at java.lang.ProcessImpl.<init>(ProcessImpl.java:385)
    at java.lang.ProcessImpl.start(ProcessImpl.java:136)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 16 more
Build step 'Execute shell' marked build as failure
Finished: FAILURE
NOTE 2: In the Jenkins configuration, under Shell, I specified C:\cygwin\bin\mintty.exe. After that, the output is as follows:
$ C:\cygwin\bin\cygpath -w C:\cygwin\bin\mintty.exe
[workspace] $ C:\cygwin\bin\mintty.exe -xe D:\539707\tomcat-7.0.12\temp\hudson4745164988293910592.sh
/usr/bin/mintty: unknown option '-x'
Try '--help' for more information.
Build step 'Execute shell' marked build as failure
Finished: FAILURE
Kindly suggest how to execute an sh file from Jenkins running on Windows.
For each Windows slave, you can do the following to add Cygwin to the PATH, assuming the slave already has Cygwin installed:
Jenkins - Manage Jenkins - Manage Nodes
node - Configure
Environment variables : list of key-value pairs
add: name: PATH value: ${PATH};path-to-cygwin\bin
E.g., name: PATH value: ${PATH};d:\tools\cygwin\bin
Here is the solution: set the Shell executable path to cygwin_home\bin\sh and, in the Jenkins Build step -> Execute shell command, give the file name (e.g. *.sh); or clear the Shell executable path and, in the Build step -> Execute Windows batch command, run sh path*.sh.
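For illustration, a minimal sketch of the first variant, assuming the global Shell executable is set to C:\cygwin\bin\sh.exe (the Cygwin install path is an assumption); the Execute shell build step then simply contains:

# Execute shell build step contents; runs under Cygwin's sh once the
# Shell executable points at C:\cygwin\bin\sh.exe:
sh /cygdrive/d/539707/data/getchanges/gymBuild.sh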

Google Compute Engine: libsnappy not installed error during command-line installation of Hadoop

I'm trying to install a custom Hadoop implementation (>2.0) on Google Compute Engine using the command line option. The modified parameters of my bdutil_env.sh file are as follows:
GCE_IMAGE='ubuntu-14-04'
GCE_MACHINE_TYPE='n1-standard-1'
GCE_ZONE='us-central1-a'
DEFAULT_FS='hdfs'
HADOOP_TARBALL_URI='gs://<mybucket>/<my_hadoop_tar.gz>'
The ./bdutil deploy fails with exit code 1. I find the following errors in the resulting debug.info file:
ssh: connect to host 130.211.161.181 port 22: Connection refused
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
ssh: connect to host 104.197.63.39 port 22: Connection refused
ssh: connect to host 104.197.7.106 port 22: Connection refused
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
.....
.....
Connection to 104.197.7.106 closed.
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [123].
Connection to 104.197.63.39 closed.
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [123].
Connection to 130.211.161.181 closed.
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [123].
...
...
hadoop-w-1: ==> deploy-core-setup_deploy.stderr <==
....
....
hadoop-w-1: dpkg-query: package 'libsnappy1' is not installed and no information is available
hadoop-w-1: Use dpkg --info (= dpkg-deb --info) to examine archive files,
hadoop-w-1: and dpkg --contents (= dpkg-deb --contents) to list their contents.
hadoop-w-1: dpkg-preconfigure: unable to re-open stdin: No such file or directory
hadoop-w-1: dpkg-query: package 'libsnappy-dev' is not installed and no information is available
hadoop-w-1: Use dpkg --info (= dpkg-deb --info) to examine archive files,
hadoop-w-1: and dpkg --contents (= dpkg-deb --contents) to list their contents.
hadoop-w-1: dpkg-preconfigure: unable to re-open stdin: No such file or directory
hadoop-w-1: ./hadoop-env-setup.sh: line 612: Package:: command not found
....
....
hadoop-w-1: find: `/home/hadoop/hadoop-install/lib': No such file or directory
I don't understand why the initial ssh errors occur; I can see the VMs and log in to them from the UI without problems, and my tar.gz is copied to the proper places.
I also do not understand why libsnappy wasn't installed; is there anything in particular I need to do? The shell scripts seem to contain commands to install it, but they fail somehow.
I checked all the VMs; Hadoop is not up.
EDIT: To solve the ssh problem, I ran the following command:
gcutil --project= addfirewall --allowed=tcp:22 default-ssh
It made no difference.
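(For reference, gcutil has since been deprecated; a roughly equivalent rule with the gcloud CLI, assuming the default network, would be:)

# Approximate gcloud equivalent of the gcutil addfirewall command above:
gcloud compute firewall-rules create default-ssh --allow tcp:22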
In this case, the ssh and libsnappy errors are red herrings; when the VMs weren't immediately SSH-able, bdutil polled for a while until it should have printed something like:
...Thu May 14 16:52:23 PDT 2015: Waiting on async 'wait_for_ssh' jobs to finish. Might take a while...
...
Thu May 14 16:52:33 PDT 2015: Instances all ssh-able
Likewise, the libsnappy error you saw was a red herring, because it comes from a call to dpkg -s that checks whether a package is already installed and, if not, apt-get installs it: https://github.com/GoogleCloudPlatform/bdutil/blob/master/libexec/bdutil_helpers.sh#L163
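That pattern amounts to a check-then-install loop, roughly like the following sketch (illustrative, not the exact bdutil_helpers.sh code):

# Install each package only if dpkg does not already report it as installed.
# The "is not installed and no information is available" lines in
# deploy-core-setup_deploy.stderr come from exactly this kind of dpkg query.
for pkg in libsnappy1 libsnappy-dev; do
  if ! dpkg -s "$pkg" >/dev/null 2>&1; then
    apt-get install -y "$pkg"
  fi
done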
We'll work on cleaning up these error messages since they can be misleading. In the meantime, the main issue here is that Ubuntu hasn't historically been one of the supported images for bdutil; we thoroughly validate CentOS and Debian images, but not Ubuntu images, since they were only added as GCE options in November 2014. Your deployment should work fine with your custom tarball for any debian-7 or centos-6 image. We've filed an issue on GitHub to track Ubuntu support for bdutil: https://github.com/GoogleCloudPlatform/bdutil/issues/29
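Concretely, that would just mean switching the image in bdutil_env.sh, for example (a sketch; the exact image name available in your project may differ):

# In bdutil_env.sh, replace the Ubuntu image with a validated one:
GCE_IMAGE='debian-7'        # or e.g. 'centos-6'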
EDIT: The issue has been resolved, with Ubuntu now supported at head in the master repository; you can download it from the most recent commit here.
Looking at your error output, it seems you need to add the snappy libraries to your classpath. If you are using Java, you can download them from https://github.com/xerial/snappy-java, or try https://code.google.com/p/snappy/.
