When I am running the following query in hive:
hive> select count(*) from testsql;
I am getting the following error:
Error
FAILED: RuntimeException java.net.ConnectException: Call From impetus-1466/192.168.49.77 to impetus-1466:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
The jps looks like:
[impadmin@impetus-1466 hadoop-1.0.3.15]$ jps
26380 TaskTracker
26709 Jps
26230 JobTracker
25943 NameNode
I started the daemons with:
$ start-all.sh
$ start-dfs.sh
$ start-mapred.sh
How could this be solved?
Thanks
If you can open http://localhost:8088/cluster but can't open http://localhost:50070/, the DataNode may not have started or the NameNode may not have been formatted.
Also check hadoop.tmp.dir in core-site.xml; if it is not set, it defaults to /tmp, so set hadoop.tmp.dir in core-site.xml:
<property>
<name>hadoop.tmp.dir</name>
<value>/path/to/hadoop/tmp</value>
</property>
Then stop Hadoop, reformat the NameNode with hdfs namenode -format, and restart Hadoop.
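A minimal sketch of that sequence, assuming the start/stop scripts are on your PATH (note that formatting erases any existing HDFS data):
stop-all.sh                # stop all daemons
hdfs namenode -format      # on Hadoop 1.x installations the equivalent is: hadoop namenode -format
start-all.sh               # start the daemons again
jps                        # NameNode and DataNode should both be listed now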
Similar question: http://localhost:50070 does not work HADOOP
The reason for this is that either there are no DataNodes in your cluster or the DataNodes do not know their NameNode. This can happen when the NameNode has been formatted more than once: the NameNode's cluster ID changed, but the change was never propagated to the DataNodes.
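One way to confirm a mismatch is to compare the IDs stored on disk. A sketch, assuming the default storage layout under hadoop.tmp.dir (adjust the paths to your dfs.name.dir / dfs.data.dir settings):
grep -E 'clusterID|namespaceID' /path/to/hadoop/tmp/dfs/name/current/VERSION   # ID recorded by the NameNode
grep -E 'clusterID|namespaceID' /path/to/hadoop/tmp/dfs/data/current/VERSION   # ID recorded by the DataNode
# If the two IDs differ, either reformat (losing HDFS data) or copy the NameNode's ID
# into the DataNode's VERSION file and restart the DataNode.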
The links below might be helpful:
Datanode not starts correctly
http://hortonworks.com/community/forums/topic/clusterid-mismatch-for-namenode-and-datanodes-in-fully-distributed-cluster/
Related
I installed Hadoop 3.3.4 on Ubuntu 20. I ran the command to start Hadoop, i.e.
samar@pc:~$ $HADOOP_HOME/sbin/start-all.sh
It showed the following output:
WARNING: Attempting to start all Apache Hadoop daemons as samar in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [pc]
Starting resourcemanager
Starting nodemanagers
But when I tried to access HDFS with the command
samar@pc:~$ hdfs dfs -ls
it gave this message:
ls: Call From pc/127.0.1.1 to localhost:9000 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
and the output of jps was:
10485 Jps
10101 NodeManager
9946 ResourceManager
9739 SecondaryNameNode
9533 DataNode
The NameNode did not start successfully (9000 is the NameNode's service port).
Are there more logs?
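A way to dig further, assuming the default log location under $HADOOP_HOME/logs (the exact file name will differ on your machine):
tail -n 50 $HADOOP_HOME/logs/hadoop-*-namenode-*.log   # look for the reason the NameNode exited
$HADOOP_HOME/bin/hdfs --daemon start namenode          # try starting the NameNode on its own (Hadoop 3.x)
jps                                                    # NameNode should now appear if the start succeeded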
I ran into a problem when trying to use the hadoop hdfs command:
root@ec2-35-205-125-85:~# hdfs dfs -copyFromLocal ~/input/ ~/input/
copyFromLocal: Call From ip-172-32-5-110.us-west-2.compute.internal/172.32.5.110 to localhost:54310 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
This problem happens not just with -copyFromLocal but with every command that starts with hdfs, for example -ls, -mkdir, and so on.
The problem was solved after running the following command:
bash /usr/local/hadoop/sbin/start-all.sh
After this, all the daemons should be running. I checked it with jps, which showed the following:
2033 SecondaryNameNode
2778 Jps
2325 NodeManager
2195 ResourceManager
1691 NameNode
After that, run:
hdfs namenode -format
And then the error message was gone.
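For reference, a sketch of the more usual order (stop, format, then start), assuming the same script location as above; keep in mind that formatting wipes any existing HDFS data:
bash /usr/local/hadoop/sbin/stop-all.sh
hdfs namenode -format                       # erases existing HDFS data
bash /usr/local/hadoop/sbin/start-all.sh
jps                                         # NameNode and DataNode should both be listed
hdfs dfs -ls /                              # should now succeed without a ConnectException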
I set up a Hadoop 2.6 cluster using two nodes with 8 cores each on Ubuntu 12.04. sbin/start-dfs.sh and sbin/start-yarn.sh both succeed, and jps on the master node shows the following:
22437 DataNode
22988 ResourceManager
24668 Jps
22748 SecondaryNameNode
23244 NodeManager
The jps output on the slave node is:
19693 DataNode
19966 NodeManager
I then run the Pi example:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 30 100
which gives me this error log:
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "Master-R5-Node/xxx.ww.y.zz"; destination host is: "Master-R5-Node":54310;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
The problem seems to be with the HDFS file system, since the command bin/hdfs dfs -mkdir /user fails with a similar exception:
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "Master-R5-Node/xxx.ww.y.zz"; destination host is: "Master-R5-Node":54310;
where xxx.ww.y.zz is the IP address of Master-R5-Node.
I have checked and followed all the recommendations on the Apache ConnectionRefused wiki page and on this site.
Despite a week-long effort, I cannot get it fixed.
Thanks.
There are many possible reasons for the problem I faced, but I finally fixed it with some of the following steps.
Make sure that you have the needed permissions on /hadoop and the HDFS temporary files (you have to figure out where those are for your particular case).
Remove the port number from fs.defaultFS in $HADOOP_CONF_DIR/core-site.xml. It should look like this:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://my.master.ip.address/</value>
<description>NameNode URI</description>
</property>
</configuration>
Add the following two properties to $HADOOP_CONF_DIR/hdfs-site.xml:
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
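After editing the config files, HDFS needs a restart for the changes to take effect. A minimal sketch, assuming the standard sbin scripts (locations may differ in your installation):
$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh
hdfs dfsadmin -report      # the DataNodes should now be listed as live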
Voila! You should now be up and running!
After a new Hadoop single-node installation, I got the following error in hadoop-root-datanode-localhost.localdomain.log:
2014-06-18 23:43:23,594 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root cause:java.net.ConnectException: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
2014-06-18 23:43:23,595 INFO org.apache.hadoop.mapred.JobTracker: Problem connecting to HDFS Namenode... re-trying
java.net.ConnectException: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142)
Any idea?
jps is not giving any output.
core-site.xml is updated:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/surya/hadoop-1.2.1/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
Also, when formatting with hadoop namenode -format, I got the aborted error below:
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N) y
Format aborted in /tmp/hadoop-root/dfs/name
You need to run hadoop namenode -format as the hdfs-superuser. Probably the "hdfs" user itself.
The hint can be seen here:
UserGroupInformation: PriviledgedActionException as:root cause:java
Another thing to consider: you really want to move your HDFS root to something other than /tmp. You risk losing your HDFS contents when /tmp is cleaned (which could happen at any time).
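For example (assuming an hdfs system user exists on your machine, which depends on how Hadoop was installed):
sudo -u hdfs hadoop namenode -format    # run the format as the HDFS superuser
# and point hadoop.tmp.dir / dfs.name.dir in your config at a persistent directory
# (e.g. the /opt/surya/hadoop-1.2.1/tmp path from your core-site.xml) rather than /tmp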
UPDATE based on OP comments.
RE: JobTracker unable to contact NameNode: Please do not skip steps.
First make sure you format the NameNode
Then start the NameNode and DataNodes
Run some basic HDFS commands, such as hdfs dfs -put and hdfs dfs -get
Then you can start the JobTracker and TaskTracker
Then (and not earlier) you can try to run a MapReduce job (which uses HDFS). A sketch of the full sequence follows below.
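A sketch of that sequence with Hadoop 1.x commands (the file names here are only examples):
hadoop namenode -format                    # 1. format the NameNode (once)
start-dfs.sh                               # 2. start the NameNode and DataNodes
hdfs dfs -put somefile.txt /tmp/           # 3. basic HDFS sanity checks
hdfs dfs -get /tmp/somefile.txt copy.txt   #    (on older installs, hadoop fs works the same way)
start-mapred.sh                            # 4. start the JobTracker and TaskTrackers
#                                            5. only now submit a MapReduce job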
1) Please run "jps" in the console and show what it outputs.
2) Please provide core-site.xml (I think you might have a wrong fs.default.name).
Concerning this error:
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N) y
Format aborted in /tmp/hadoop-root/dfs/name
You need to use a capital Y, not a lowercase y in order for it to accept the input and actually do the formatting.
I am running Hadoop 1.2.1 and HBase 0.94.11 in pseudo-distributed mode.
Due to a power failure, the Hadoop and HBase setup went down. The next time I restarted my machine and the pseudo-distributed setup, HBase stopped working with the following errors in the HBase shell:
13/11/27 13:53:27 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
13/11/27 13:53:27 WARN zookeeper.ZKUtil: hconnection Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:450)
at org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61)
at org.apache.hadoop.hbase.zookeeper.ClusterId.getId(ClusterId.java:50)
at org.apache.hadoop.hbase.zookeeper.ClusterId.hasId(ClusterId.java:44)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:720)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:789)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:129)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
These are the running processes:
hduser@user-ubuntu:~$ jps
16914 NameNode
19955 Jps
29460 Main
17728 TaskTracker
19776 HMaster
17490 JobTracker
17392 SecondaryNameNode
Are you sure your ZooKeeper process is running? (Your jps listing doesn't show an entry for QuorumPeerMain.) jps may not show all running Java processes; try ps axww | grep QuorumPeerMain.
If ZooKeeper refuses to start, check its logs for stack-trace clues.
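A quick check, assuming ZooKeeper's default client port 2181:
ps axww | grep QuorumPeerMain    # an HBase-managed ZooKeeper shows up as HQuorumPeer instead
echo ruok | nc localhost 2181    # ZooKeeper answers "imok" if it is serving requests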
It's straightforward: the ZooKeeper quorum process is not running. If it were, there would be another Java process in the listing:
hduser@user-ubuntu:~$ jps
16914 NameNode
19955 Jps
29460 Main
17728 TaskTracker
19776 HMaster
17490 JobTracker
17392 SecondaryNameNode
xxxxx HQuorumPeer
ZooKeeper is required for an HBase cluster, as it manages the cluster.
Possible solutions:
By default, HBase manages ZooKeeper itself, i.e. it starts and stops the ZooKeeper quorum (the cluster of ZooKeeper nodes). To verify the setting, look into the file conf/hbase-env.sh (in your HBase directory); there must be a line:
export HBASE_MANAGES_ZK=true
This tells HBase whether it should manage its own instance of ZooKeeper or not. If it is set to false, change it to true.
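A quick check from the HBase directory:
grep HBASE_MANAGES_ZK conf/hbase-env.sh    # should print: export HBASE_MANAGES_ZK=true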
Also verify the HBase configuration in conf/hbase-site.xml.
The minimum configuration that should work for pseudo-distributed mode is:
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/<yourusername>/zookeeper</value>
</property>
</configuration>
Now stop HBase, if it is running:
$ ./bin/stop-hbase.sh
Make the necessary changes and start it again:
$ ./bin/start-hbase.sh
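Once it is back up, jps should list the HBase daemons alongside the Hadoop ones; for a pseudo-distributed setup with a managed ZooKeeper that typically includes (PIDs will differ):
jps
# NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker   (Hadoop 1.x)
# HQuorumPeer, HMaster, HRegionServer                              (HBase)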
Answers you may find helpful: 1 2