I'm trying to set up two types of Hadoop clusters: one standalone via SSH to localhost and the other on AWS EC2.
Both fail with a similar issue: a connection refused error.
Here are some pictures of the issues. This is the result of ssh localhost.
The next is the failed run.
This is the relevant portion of ~/.ssh/config:
I can run hadoop, hdfs, yarn, and all the other commands. But when I actually run the command below, it fails.
Of note, I'm following this tutorial for the AWS EC2 cluster (the command is almost at the end): https://awstip.com/setting-up-multi-node-apache-hadoop-cluster-on-aws-ec2-from-scratch-2e9caa6881bd
It is failing on this command: scp hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml ubuntu@ec2-54-209-221-47.compute-1.amazonaws.com:/home/ubuntu/hadoop/conf
That's not my EC2 link; it's from the example, but that's where it fails with the same error as the 2nd and 4th pictures.
Related
I have hadoop installed in pseudo-distributed mode.
When running the command
hadoop fs -ls
I am getting the following error:
ls: Call From kali/127.0.1.1 to localhost:9000 failed on connection exception:
java.net.ConnectException: Connection refused;
For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Any suggestions?
If you read the link in the error, there are two immediate points that need to be addressed.
If the error message says the remote service is on "127.0.0.1" or "localhost" that means the configuration file is telling the client that the service is on the local server. If your client is trying to talk to a remote system, then your configuration is broken.
You should treat pseudodistributed mode as a remote system, even if it is only running locally.
For HDFS, you can resolve that by putting your computer's hostname (preferably the full FQDN for your domain) as the HDFS address in core-site.xml. In your case, hdfs://kali:9000 should be enough.
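For example, a minimal sketch of that core-site.xml entry (fs.defaultFS is the property name on current releases; older ones used fs.default.name):
<property>
<!-- point clients at the NameNode by hostname, not localhost -->
<name>fs.defaultFS</name>
<value>hdfs://kali:9000</value>
</property>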
Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this).
I'm not completely sure why it needs to be removed, but the general answer I can think of is that Hadoop is a distributed system, and, as I mentioned, you should treat pseudo-distributed mode as if it were a remote HDFS server. Therefore, no loopback address should be mapped to your computer's hostname.
For example, remove the second line of this
127.0.0.1 localhost
127.0.1.1 kali
Or remove the hostname from this
127.0.0.1 localhost kali
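A cleaned-up /etc/hosts would then look something like this (192.168.1.10 is a made-up LAN address standing in for your machine's real one):
127.0.0.1 localhost
192.168.1.10 kali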
Most importantly (emphasis added):
None of these are Hadoop problems, they are hadoop, host, network and firewall configuration issues
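In that spirit, a quick sanity check I would suggest (not part of the quoted page) is to confirm the NameNode process is up and actually listening on the port from the error:
jps | grep NameNode     # the NameNode process should be listed
ss -tlnp | grep 9000    # something should be listening on port 9000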
I started following an online tutorial to configure multiple nodes on my single local VM. Here is the /etc/hosts file on the master node:
127.0.0.1 localhost
192.168.96.132 hadoop
192.168.96.135 hadoop1
192.168.96.136 hadoop2
ssh:ALL:allow
sshd:ALL:allow
Here is the command that used to work: hdfs dfs -ls
Now I am seeing error message below:
ls: Call From hadoop/192.168.96.132 to hadoop:9000 failed on connection exception:
java.net.ConnectException: Connection refused;
For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
What is wrong with my configuration? Where should I check and correct it?
Thank you very much.
First try to ping each node:
ping hadoop
ping hadoop1
ping hadoop2
Then just try to connect via ssh. The syntax is:
ssh username@hadoop
ssh username@hadoop1
ssh username@hadoop2
Then see the results to find out whether the systems are connecting or not.
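If ping and ssh both work but the error persists, it is also worth checking (my suggestion) that the NameNode is actually listening on port 9000 on the master:
telnet hadoop 9000      # should connect if the NameNode is listening on 9000
jps                     # run on the master; NameNode should be in the list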
I am trying to access a firewalled Hadoop cluster running YARN via a SOCKS proxy. The cluster itself is not using proxied connections -- only my client running on a local machine (e.g. a laptop) is connected via ssh -D 9999 user@gateway-host to a machine that can see the Hadoop cluster.
In the Hadoop configuration core-site.xml (on my laptop) I have the following lines:
<property>
<name>hadoop.socks.server</name>
<value>localhost:9999</value>
</property>
<property>
<name>hadoop.rpc.socket.factory.class.default</name>
<value>org.apache.hadoop.net.SocksSocketFactory</value>
</property>
Accessing HDFS this way works great. However, when I try to submit a YARN job, it fails and I can see in the logs that the nodes are not able to talk to each other:
java.io.IOException: Failed on local exception: java.net.SocketException: Connection refused; Host Details : local host is: "host1"; destination host is: "host2":8030;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
where host1 and host2 are both parts of the hadoop cluster.
I guess what is happening is that the Hadoop nodes are trying to communicate via a SOCKS proxy as well, and this is obviously failing since no proxy server exists on each host. Is there a way to fix this apart from setting up a dedicated proxy server?
You are right, the Hadoop nodes must not use the SOCKS proxy for the communication. You can achieve that by marking the SocketFactory setting on the cluster side final.
In core-site.xml on the cluster, add the final tag to the default SocketFactory property:
<property>
<name>hadoop.rpc.socket.factory.class.default</name>
<value>org.apache.hadoop.net.StandardSocketFactory</value>
<final>true</final>
</property>
Obviously, you must restart cluster services.
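For example, if the cluster is managed with the stock Hadoop scripts, something along these lines should do it (paths and service managers vary by distribution, so treat this as a sketch):
$HADOOP_PREFIX/sbin/stop-yarn.sh && $HADOOP_PREFIX/sbin/stop-dfs.sh
$HADOOP_PREFIX/sbin/start-dfs.sh && $HADOOP_PREFIX/sbin/start-yarn.sh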
I have a container running Hadoop. I have another Dockerfile which contains MapReduce job commands like creating an input directory, processing a default example, and displaying the output. The base image for the second file is hadoop_image, created from the first Dockerfile.
EDIT
Dockerfile - for hadoop
#base image is ubuntu:precise
#cdh installation
#hadoop-0.20-conf-pseudo installation
#CMD to start-all.sh
start-all.sh
#start all the services under /etc/init.d/hadoop-*
The hadoop base image is created from this.
Dockerfile2
#base image is hadoop
#flume-ng and flume-ng agent installation
#conf change
#flume-start.sh
flume-start.sh
#start flume services
I am running both containers separately, and it works fine. But if I run
docker run -it flume_service
it starts Flume and shows me a bash prompt [/bin/bash is the last line of flume-start.sh]. Then I execute
hadoop fs -ls /
in the second running container, and I get the following error:
ls: Call From 514fa776649a/172.17.5.188 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I understand I am getting this error because the Hadoop services are not started yet. But my doubt is: my first container is running, and I am using it as the base image for the second container. Then why am I getting this error? Do I need to change anything in the hdfs-site.xml file on the flume container?
Pseudo-Distributed mode installation.
Any suggestions?
Or do I need to expose any ports or something like that? If so, please provide me an example.
EDIT 2
For iptables -t nat -L -n, I see:
sudo iptables -t nat -L -n
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
DOCKER all -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE tcp -- 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-6
MASQUERADE udp -- 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-6
MASQUERADE all -- 192.168.122.0/24 !192.168.122.0/24
MASQUERADE all -- 172.17.0.0/16 0.0.0.0/0
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
DOCKER all -- 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-
Chain DOCKER (2 references)
target prot opt source destination
This is on the Docker host (docker@domain), not inside a container.
EDIT
See the last comment under surazj's answer.
Have you tried linking the container?
For example, your container named hadoop is running in pseudo-distributed mode. You want to bring up another container that contains Flume. You could link the containers like this:
docker run -it --link hadoop:hadoop --name flume ubuntu:14.04 bash
When you get inside the flume container, type the env command to see the IP and ports exposed by the hadoop container.
From the flume container you should be able to do something like the following (the ports on the hadoop container should be exposed):
$ hadoop fs -ls hdfs://<hadoop containers IP>:8020/
The error you are getting might be related to some Hadoop services not running on the flume container. Do jps to check which services are running. But I think if you have the Hadoop classpath set up correctly on the flume container, then you can run the above hdfs command (hadoop fs -ls hdfs://<hadoop containers IP>:8020/) without starting anything. But if you want
hadoop fs -ls /
to work on flume container, then you need to start hadoop services on flume container also.
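As a sketch of the port exposure mentioned above, the Dockerfile for the hadoop image could expose the NameNode RPC port so that linked containers can reach it (8020 matches the error in the question; add any other ports you need):
# in the Dockerfile for the hadoop image
EXPOSE 8020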
In your core-site.xml, add dfs.namenode.rpc-address like this so the namenode listens for connections on all IPs:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>0.0.0.0:8020</value>
</property>
Make sure to restart the namenode and datanode
sudo /etc/init.d/hadoop-hdfs-namenode restart && sudo /etc/init.d/hadoop-hdfs-datanode restart
Then you should be able to do this from your hadoop container without a connection error, e.g.:
hadoop fs -ls hdfs://localhost:8020/
hadoop fs -ls hdfs://172.17.0.11:8020/
On the linked container, type env to see the ports exposed by your hadoop container:
env
You should see something like
HADOOP_PORT_8020_TCP=tcp://172.17.0.11:8020
Then you can verify the connection from your linked container.
telnet 172.17.0.11 8020
I think I ran into the same problem. I also couldn't start the Hadoop namenode and datanode with the "start-all.sh" command in docker1.
That is because it launches the namenode and datanode through "hadoop-daemons.sh", and that fails. The real problem is that "ssh" does not work inside Docker.
So, you can do either of the following:
(solution 1):
Replace all occurrences of "daemons.sh" with "daemon.sh" in start-dfs.sh,
then run start-dfs.sh
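A one-line way to make that replacement (my sketch; adjust the path to your Hadoop install and keep the .bak backup in case something goes wrong):
sed -i.bak 's/daemons\.sh/daemon.sh/g' $HADOOP_PREFIX/sbin/start-dfs.sh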
(solution 2): do
$HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode
$HADOOP_PREFIX/sbin/hadoop-daemon.sh start namenode
You can see that the datanode and namenode are working fine with the "jps" command.
Regards.
I am getting the error below when I try to configure the Hadoop plugin in Eclipse.
Error: Call to localhost:54310 failed on connection exception: java.net.ConnectException: Connection refused: no further information
Hadoop version is 1.0.4
I have installed Hadoop on Linux and I am running Eclipse on Windows.
In the Hadoop location window, I have tried the host as both localhost and the Linux server:
MR Master: Host: localhost and port 54311
DFS Master: Host: localhost and port 54310
MR Master: Host: <Linux server name> and port 54311
DFS Master: Host: <Linux server name> and port 54310
In my mapred-site.xml I see the entry localhost:54311.
A ConnectionRefused error also occurs when you are trying to access a directory that you don't have permission to read or write.
This may be caused by a directory created by another user (e.g. root) that your master machine is trying to read from or write to.
It is more likely that you are trying to read input from the wrong place. Check your input directory; if there is no problem with it, check your output directory.
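For example, you can inspect the ownership and permissions of the directories involved like this (the paths below are placeholders for your actual input/output paths):
hadoop fs -ls /user/yourname                    # check who owns the input/output directories
hadoop fs -chown -R yourname /user/yourname     # fix ownership if needed (run as the HDFS superuser)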