Hadoop 2.6.4 Web UI Time Out - hadoop

I installed Hadoop 2.6.4 on AWS across 4 instances: 1 namenode, 1 secondary namenode, and 2 slaves. After the installation completed, I tried to view the namenode web UI at ec2-52-90-242-76.compute-1.amazonaws.com:50070, but the request times out. Can anybody help?

If you are accessing the UI from your own machine, you need to add the instance's IP address and hostname to your hosts file, or you can open the UI directly at IP_address:50070.
Also check the following:
Check whether the firewall is on or off (off is recommended)
Check the iptables service status (stopped is recommended)
Check SELinux (disabled is recommended)
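Before changing firewall or SELinux settings, it can help to confirm from your own machine whether the port is reachable at all. The following is a minimal sketch, not part of the original answer; the function name port_is_reachable and the 5-second timeout are assumptions made for illustration.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False


# Example call (run it from the machine where the browser times out):
#   port_is_reachable("ec2-52-90-242-76.compute-1.amazonaws.com", 50070)
```

If this returns False from your machine but True when run on the namenode itself, something between the two (a firewall rule or similar) is likely blocking the port.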

Related

RabbitMQ Erlang distribution failed

I have two Windows Server 2012 R2 machines located in one of the client's datacenters. Both servers are domain-joined. They both have RabbitMQ 3.6.0 installed, and RabbitMQ is running as a Windows service on both machines. I've been trying to cluster these two machines for a long time now without success. I always get the following error when I try to cluster them.
On the first machine, nodeA, I run the command 'rabbitmqctl join_cluster rabbit@nodeB'. This is what I get:
Clustering node 'rabbit@nodeA' with 'rabbit@nodeB' ...
Error: unable to connect to nodes ['rabbit@nodeB']: nodedown
DIAGNOSTICS
===========
attempted to contact: ['rabbit@nodeB']
rabbit@nodeB:
* connected to epmd (port 4369) on nodeB
* epmd reports node 'rabbit' running on port 25672
* TCP connection succeeded but Erlang distribution failed
* suggestion: hostname mismatch?
* suggestion: is the cookie set correctly?
* suggestion: is the Erlang distribution using TLS?
current node details:
- node name: 'rabbitmq-cli-3892@nodeA'
- home dir: C:\Users\mydirectory
- cookie hash: l+SSu57+cRyAQ03AJdwAbQ==
- home dir: C:\Users\mydirectory
- cookie hash: l+SSu57+cRyAQ03AJdwAbQ==
I've tried this setup with Azure Virtual Machines within an Azure Virtual Network and succeeded in clustering the two VMs. However, it seems I cannot connect these two customer machines together.
This is what I have done and ensured:
There isn't any firewall blocking connections
Added host names to hosts file located on C:\Windows\system32\drivers\etc
Tried to refer to host names as FQDN without adding anything to hosts file
Tried to refer to host names with CAPITAL letters and without
Copied the same exact .erlang.cookie to C:\Windows and C:\Users\mydirectory on both machines.
I've read, understood and applied RabbitMQ Clustering Guide https://www.rabbitmq.com/clustering.html
Stopped, restarted, reinstalled RabbitMQ on both machines.
It seems I can't get it to work. On Azure machines, which were not domain-joined clustering worked beautifully. I am really running out of options... Any help?
I had the same problem. You need to install RabbitMQ as an admin. Uninstall it, then reinstall it as admin, and it should work fine.
Try to connect to each of RabbitMQ nodes via remote shell and check if value of cookie is the same (cookie can be set in 3 different ways: .erlang.cookie is one of them).
erl -remsh 'rabbitmq-cli-3892@nodeA' -name 'test@nodeA'
erlang:get_cookie().
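Another way to compare cookies without opening a remote shell is to reproduce the "cookie hash" that the diagnostics print. As far as I can tell, rabbitmqctl reports the Base64-encoded MD5 digest of the cookie string; treat that as an assumption and verify it against your own output. The sketch below also assumes that trailing whitespace in the cookie file is ignored, and the name cookie_hash is made up for illustration.

```python
import base64
import hashlib


def cookie_hash(cookie: str) -> str:
    """Base64 of the MD5 digest of the cookie (assumed to match rabbitmqctl's 'cookie hash')."""
    # Assumption: a trailing newline or spaces in the cookie file are not part of the cookie.
    digest = hashlib.md5(cookie.strip().encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# Example: read each .erlang.cookie file and compare the resulting hashes.
#   with open(r"C:\Users\mydirectory\.erlang.cookie") as f:
#       print(cookie_hash(f.read()))
```

If the hash computed from a cookie file differs from the one in the diagnostics of the node or CLI that should be using it, that process is reading a different cookie.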

cloudera host with bad health during install

I have tried again and again with all required steps completed, but when the cluster installation reaches the step that installs the selected parcels, it always shows every host with bad health. The setup never completes.
I am installing CM 5.5 on CentOS 6.7 using VirtualBox.
The Error
Host is in bad health cm.feuni.edu
Host is in bad health dn1.feuni.edu
Host is in bad health dn2.feuni.edu
Host is in bad health nn1.feuni.edu
Host is in bad health nn2.feuni.edu
Host is in bad health rm.feuni.edu
The above errors are shown at step 6, where the setup says
The selected parcels are being downloaded and installed on all the hosts in the cluster
In the previous step, step 5, all hosts completed successfully, ending with the heartbeat checks.
Memory allocation:
cm 8GB
all others 1GB
I could not find a proper answer anywhere else. What could be the reason for the bad health?
I don't know if this will help you...
For me, after struggling with it for a few days,
I found the log files (at )
They contained a message saying there was a mismatch of the GUID,
so I uninstalled everything from both machines (using the script they provide, /usr/share/cmf/uninstall-cloudera-manager.sh, then yum remove 'cloudera-manager-*', and deleting every Cloudera-related directory I found...)
and then removed the GUID file:
rm /var/lib/cloudera-scm-agent/cm_guid
Afterwards I re-installed everything, and that fixed that issue for me...
I read online that there can be issues with the hostname and things like that, but I guess that if you get to this part of the installation, you have already fixed all the domain/FQDN/hostname/hosts issues.
It saddens me there is no real manual/FAQ for this product.. :(
Good luck!
I faced the same problem. This is my solution:
First I edited config.ini
$ nano /etc/cloudera-scm-agent/config.ini
so that the hostname was the same as the one returned by the $ hostname command.
Then I restarted the Cloudera agent and server:
$ service cloudera-scm-agent restart
$ service cloudera-scm-server restart
Then, in Cloudera Manager, I deleted the cluster and added it again. The wizard then ran normally.
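The hostname comparison described above can be sketched as a small check. This is only an illustration: the answer does not say which key in config.ini was edited, so the section name General and the key listening_hostname are assumptions, as are the function names.

```python
import configparser
from typing import Optional


def configured_hostname(ini_text: str,
                        section: str = "General",
                        key: str = "listening_hostname") -> Optional[str]:
    """Return the hostname set in the agent config text, or None if it is not set."""
    # Interpolation is disabled so that '%' characters in values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get(section, key, fallback=None)


def hostname_matches(ini_text: str, actual_hostname: str) -> bool:
    """True if the configured hostname equals the one the machine reports."""
    configured = configured_hostname(ini_text)
    return configured is not None and configured.strip() == actual_hostname.strip()


# Example:
#   import socket
#   with open("/etc/cloudera-scm-agent/config.ini") as f:
#       print(hostname_matches(f.read(), socket.gethostname()))
```

A commented-out key counts as "not set" here, so the check returns False until the hostname is written explicitly.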

Unable to get Mesos to run from tutorial: Setting up a Single Node Mesosphere Cluster

I have been following the official tutorial to try to set up a single-node Mesosphere cluster:
http://mesosphere.com/docs/getting-started/developer/single-node-install/
I followed all the commands without any issues, and I also added ports 5050 and 8080 to my security group. When I try to access the console for Mesos/Marathon, I get an "Internet Explorer cannot display the webpage" message.
They also recommend checking it the following way:
MASTER=$(mesos-resolve `cat /etc/mesos/zk`)
mesos-execute --master=$MASTER --name="cluster-test" --command="sleep 5"
But that comes up with an error:
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0106 17:03:08.126703 20993 process.cpp:1561] Failed to initialize, gethostbyname2: Unknown host
*** Check failure stack trace: ***
I am not really sure how to troubleshoot this either, and there are not many tutorials I could find on how to install mesos on ubuntu.
I checked the contents of the zk file, seems to be the default value.
$ cat /etc/mesos/zk
zk://localhost:2181/mesos
I would really appreciate any clues on how to go about this one.
Edit: The process is definitely running too - just an fyi:
root 31545 8.5 5.9 187464 35604 ? Ssl 17:28 0:00 /usr/local/sbin/mesos-slave --master=zk://localhost:2181/mesos --log_dir=/var/log/mesos
root 31563 28.5 2.1 116304 12856 ? Rs 17:28 0:00 /usr/local/sbin/mesos-master --zk=zk://localhost:2181/mesos --port=5050 --log_dir=/var/log/mesos --quorum=1 --wo
Mesos uses gethostbyname2 to resolve hostnames to IPs. The first thing I would recommend, is to try "ping localhost" and "ping hostname", and verify that there are no strange settings in /etc/hosts. If you're doing a multi-node cluster, I'd recommend that hostname map to the public IP address (not 127.0.x.1).
If that doesn't help, you can try setting the --ip and --hostname flags when starting mesos-master and mesos-slave, to bypass the gethostbyname2 resolution. These can also be set by writing to the file-based parameters, e.g. /etc/mesos/mesos-master/ip
For additional troubleshooting, try running wget http://localhost:5050 (or curl -L) from the mesos master, to verify that it is locally visible. Also try wget http://<public_ip>:5050 to verify that the web server is up and serving to the public IP. Depending on how your (EC2?) node is setup, you may need to expose/forward the port, or connect to a VPN.
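The hostname check suggested above can also be scripted. The sketch below is a rough illustration rather than anything Mesos itself provides: it resolves a name to an IPv4 address and reports whether that address is a loopback one. The function names are made up for this example.

```python
import ipaddress
import socket
from typing import Optional


def resolve_ipv4(name: str) -> Optional[str]:
    """Resolve a hostname to an IPv4 address, or return None if resolution fails."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror (unknown host) is a subclass of OSError.
        return None


def is_loopback(ip: str) -> bool:
    """True for any address in 127.0.0.0/8 (for example 127.0.0.1 or 127.0.1.1)."""
    return ipaddress.ip_address(ip).is_loopback


# Example: check what the machine's own hostname resolves to.
#   ip = resolve_ipv4(socket.gethostname())
#   print(ip, None if ip is None else is_loopback(ip))
```

A result of None corresponds to the "Unknown host" failure in the question; a loopback result is fine for a single node but not for a multi-node cluster.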
Thanks Adam. I ran the wget and curl commands, and nothing was actually listening on port 8080 or 5050, even though I had opened those ports in EC2. A simple reboot did the trick, however. Once I ssh'ed into the EC2 instance after the reboot, both Mesos and Marathon were running, and both ports now show up when I run netstat -ntln.
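For repeated checks, the netstat output can be filtered for the ports of interest. This is a small sketch that assumes the usual Linux column layout of netstat -ntln (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the function name is hypothetical.

```python
def listening_ports(netstat_output: str) -> set:
    """Extract the TCP ports in LISTEN state from `netstat -ntln` output."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows have at least six columns; header rows do not start with "tcp".
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "LISTEN":
            port = fields[3].rsplit(":", 1)[-1]
            if port.isdigit():
                ports.add(int(port))
    return ports


# Example:
#   import subprocess
#   out = subprocess.run(["netstat", "-ntln"], capture_output=True, text=True).stdout
#   print({5050, 8080} <= listening_ports(out))
```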

CDH 5.1 host IP address change

I have a CDH 5.1 cluster with 3 nodes. We installed it using cloudera manager automated installation.
It was running perfectly until we moved the box to a different network and the IP addresses changed. I tried the following steps:
1. Stopped the cloudera-scm-server service.
2. Stopped the cloudera-scm-agent service.
3. Edited /etc/cloudera-scm-agent/config.ini.
4. Changed the server host to the new IP.
5. Restarted the cloudera-scm-agent and cloudera-scm-server services.
This did not work.
Then I followed
http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-latest/Cloudera-Manager-Administration-Guide/cmag_change_hostnames.html
That did not help either, even after changing the IPs directly in PostgreSQL.
I found the following blog:
http://www.geovanie.me/changing-ip-of-node-in-cdh-cluster/
I am getting the following error in the scm-agent log file:
ProtocolError: <ProtocolError for 127.0.0.1/RPC2: 401 Unauthorized>
It is still not working.
Can anyone please explain how to safely change all IP addresses in a CDH 5.1 cluster?
Thanks,
Amit
This is caused by the previous cloudera-scm-agent service not having been stopped correctly. Please try:
$> ps -ef | grep supervisord
$> kill -9 <processID>
Then start the agent again:
$> service cloudera-scm-agent start
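To avoid killing the wrong process, the ps output can be filtered so that the grep command itself is excluded. The following is a sketch only: it assumes the standard eight-column ps -ef layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD), and the function name is made up.

```python
def supervisord_pids(ps_output: str) -> list:
    """Return the PIDs of supervisord processes found in `ps -ef` output."""
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the full command line stays in one piece.
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # Header row or malformed line.
        command = fields[7]
        # Skip the "grep supervisord" process that the pipeline itself creates.
        if "supervisord" in command and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids


# Example:
#   import subprocess
#   out = subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout
#   print(supervisord_pids(out))
```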

Hadoop Ambari cannot confirm hosts

I tried to use Ambari to manage the installation and maintenance of a Hadoop cluster.
After starting the Ambari server, I used the web page to set up the Hadoop cluster.
But at the 3rd step, Confirm Hosts, I get an error.
I checked the log at /var/log/ambari-server and found:
INFO:root:BootStrapping hosts ['qiao'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: redhat6 with user 'root' sshKey File /var/run/ambari-server/bootstrap/1/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/1 ambari: master; server_port: 8080; ambari version: 1.4.1.25
INFO:root:Executing parallel bootstrap
ERROR:root:ERROR: Bootstrap of host qiao fails because previous action finished with non-zero exit code (1)
INFO:root:Finished parallel bootstrap
Did you provide the SSH RSA private key file, or paste it?
From the machine you are installing from, make sure you can ssh to every host without typing a password.
If you still get the same error, try
ambari-server reset
ambari-server setup
Then restart ambari-server
ambari-server restart
and try accessing Ambari again.
It should work.
Make sure you can ssh to every single host on the list, including all master hosts.
To do this, ensure that the Ambari host's .ssh/id_rsa.pub entry is included in every host's .ssh/authorized_keys file. Then ssh from the Ambari host to every single server and check whether it asks for your password. You can use a tutorial like http://www.tecmint.com/ssh-passwordless-login-using-ssh-keygen-in-5-easy-steps/ to check that everything has been done properly.
You need to do the same on the Ambari host itself, if you added it to hosts list.
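The authorized_keys check described above can be sketched as a comparison of key type and key data, ignoring the trailing comment. This is a simplified illustration: it does not handle quoted options that contain spaces, and the function names and the list of key types are choices made for this example.

```python
KEY_TYPES = (
    "ssh-rsa", "ssh-ed25519", "ssh-dss",
    "ecdsa-sha2-nistp256", "ecdsa-sha2-nistp384", "ecdsa-sha2-nistp521",
)


def key_blob(line: str):
    """Return (key_type, base64_data) from a public-key line, or None if not found."""
    fields = line.split()
    # The key type may be preceded by options, so scan for it instead of assuming field 0.
    for i, field in enumerate(fields[:-1]):
        if field in KEY_TYPES:
            return field, fields[i + 1]
    return None


def is_authorized(public_key_line: str, authorized_keys_text: str) -> bool:
    """True if the public key appears in the given authorized_keys content."""
    wanted = key_blob(public_key_line)
    if wanted is None:
        return False
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # Skip blank lines and comments.
        if key_blob(line) == wanted:
            return True
    return False


# Example: pass the contents of the Ambari host's ~/.ssh/id_rsa.pub and a copy of
# a target host's ~/.ssh/authorized_keys.
```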

Resources