I'm learning hadoop and try to set up the environment according to online documents.
I've configured the ssh (that ssh localhost won't need a password), configured the
"core-site.xml", "hdfs-site.xml", "mapred-site.xml" and "yarn-site.xml"
But when I tried "hadoop namenode -format" it gave out "java.net.UnknownHostException" and
host = java.net.UnknownHostException:
I've tried to search online help but nearly all are: change the network configuration in etc/hosts. But I'm using hadoop 2.4 and there's no such folder.
Any suggestion?
Thanks!
Well, they are right, you have to change the /etc/hosts file. I assume you have localhost on your hadoop configuration files, so you need to open /etc/hosts as sudo and add the following line:
127.0.0.1 localhost localhost
I found solution which worked for me.
First of all edit your /etc/hostname, write namenode or alias you want to set for namenode then run sudo ifconfig eth0 down&&sudo ifconfig eth0 up and try again.
Related
I have hadoop installed in pseudo-distributed mode.
When running the command
hadoop fs -ls
I am getting the following error:
ls: Call From kali/127.0.1.1 to localhost:9000 failed on connection exception:
java.net.ConnectException: Connection refused;
For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Any suggestions?
If you read the link in the error, I see two immediate points that need addressed.
If the error message says the remote service is on "127.0.0.1" or "localhost" that means the configuration file is telling the client that the service is on the local server. If your client is trying to talk to a remote system, then your configuration is broken.
You should treat pseudodistributed mode as a remote system, even if it is only running locally.
For HDFS, you can resolve that by putting your computer hostname (preferably the full FQDN for your domain), as the HDFS address in core-site.xml. For your case, hdfs://kali:9000 should be enough
Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this).
I'm not completely sure why it needs removed, but the general answer I can think of is that Hadoop is a distributed system, and as I mentioned, treat the pseudodistributed mode as if it's remote HDFS server. Therefore, no loopback addresses should use your computers hostname
For example, remove the second line of this
127.0.0.1 localhost
127.0.1.1 kali
Or remove the hostname from this
127.0.0.1 localhost kali
Most importantly (emphasis added)
None of these are Hadoop problems, they are hadoop, host, network and firewall configuration issues
I am trying to configure Hadoop cluster installed using Cent OS 6.x virtual machine,i have configured single node Hadoop cluster first to replicate and form the cluster from that later,but confused on configuration of static ip address for my virtual Hadoop cluster, my ifcfg-eth0 currently look like below,
DEVICE=eth0
TYPE=Ethernetle
UUID=892c57f5-17db-486d-b1b9-97efa8799bf0
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=dhcp
HWADDR=00:0C:29:5C:04:D0
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"
can anyone help me to configure static address for my virtual Hadoop cluster, and also i am not able to ping any other host name other than localhost but can able to ping host address,anyone please help me to resolve this ping and static address issues.
A bridge network will help. In virtualbox select machine, settings>network
Also following is an example of the file you are editing.
/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
PROTO=static
IPADDR=10.0.1.200
NETMASK=255.255.255.0
There are other requirements. Attached link might help
http://blog.cloudera.com/blog/2014/01/how-to-create-a-simple-hadoop-cluster-with-virtualbox/
For internet access files listed below needs to be modified.
/etc/resolv.conf
#Generated by NetworkManager
nameserver 8.8.8.8
nameserver 8.8.4.4
Also the file which you modified for static IP /etc/sysconfig/network-scripts/ifcfg-eth0, need following additions
DNS1=8.8.8.8
DNS2=8.8.4.4
ONBOOT=yes
I'm using Hadoop MapReduce paradigm, and i need to get the NameNode IP address from the DataNode, can any one give me an idea how to do this?
Thanks.
Easiest way would be to quickly open the core-site.xml file under HADOOP_HOME/conf directory. The value of fs.default.name property will tell you the host and port where NN is running.
Delete the line 127.0.0.1 localhost in your /etc/host and put your IP and the name of all your machines. Hadoop is resolving all the IPs and names of machines on the cluster as 127.0.0.1 localhost if you leave the file as default.
I keep getting INFO ipc.HBaseRPC: Problem connecting to server: /129.10.193.117:49176 when I try to create or disable tables in Hbase. I have googled this error and found couple of answers. All of them say something like "Default installation added a line in /etc/hosts which linked to machine hostname with the IP 127.0.1.1."
Here is my etc/host file
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost
Also my conf/regionservers is pointing to localhost. Does anyone know how to fix this ? After shutting down hbase & Hadoop and then restarting my system, I do not get this error. Not sure why.
Are you hadoop configurations also pointing to localhost ?
I'm setting up Hadoop (0.20.2). For starters, I just want it to run on a single machine - I'll probably need a cluster at some point, but I'll worry about that when I get there. I got it to the point where my client code can connect to the job tracker and start jobs, but there's one problem: the job tracker is only accessible from the same machine that it's running on. I actually did a port scan with nmap, and it shows port 9001 open when scanning from the Hadoop machine, and closed when it's from somewhere else.
I tried this on three machines (one Mac, one Ubuntu, and an Ubuntu VM running in VirtualBox), it's the same. None of them have any firewalls set up, so I'm pretty sure it's a Hadoop problem. Any suggestions?
In your hadoop configuration files, does fs.default.name and mapred.job.tracker refer to localhost?
If so, then Hadoop will only listen to port 9000 and 9001 on the loopback interface, which is inaccessible from any other host. Make sure fs.default.name and mapred.job.tracker refer to your machine's externally accessible host name.
Make sure that you have not double listed your master in the /etc/hosts file.
I had the following which only allowed master to listen on 127.0.1.1
127.0.1.1 hostname master
192.168.x.x hostname master
192.168.x.x slave-1
192.168.x.x slave-2
The above answer caused the problem. I changed my /ect/hosts file to the following to make it work.
127.0.1.1 hostname
192.168.x.x hostname master
192.168.x.x slave-1
192.168.x.x slave-2
Use the command netstat -an | grep :9000 to verify your connections are working!
In addition to the above answer, I found that in /etc/hosts on the master (running ubuntu) had the line:
127.0.1.1 master
Which meant that running nslookup master on the master returned a local address - so in spite of using master in mapred-site.xml I suffered the same problem. My solution (there are probably better ones) was to create an alias in my DNS server and use that instead. I guess you could probably also change the IP address in /etc/hosts to the external one, but I haven't tried this - I'm not sure what implications it would have for other services.