After a fresh single-node Hadoop installation, I got the following error in hadoop-root-datanode-localhost.localdomain.log:
2014-06-18 23:43:23,594 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root cause:java.net.ConnectException: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
2014-06-18 23:43:23,595 INFO org.apache.hadoop.mapred.JobTracker: Problem connecting to HDFS Namenode... re-trying
java.net.ConnectException: Call to localhost/127.0.0.1:54310 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142)
Any ideas?
jps does not give any output.
core-site.xml has been updated as follows:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/surya/hadoop-1.2.1/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
Also, formatting with hadoop namenode -format
aborts with the following message:
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N) y
Format aborted in /tmp/hadoop-root/dfs/name
You need to run hadoop namenode -format as the hdfs-superuser. Probably the "hdfs" user itself.
The hint can be seen here:
UserGroupInformation: PriviledgedActionException as:root cause:java
Another thing to consider: you really want to move your HDFS root to somewhere other than /tmp. Otherwise you risk losing your HDFS contents whenever /tmp is cleaned, which could happen at any time.
UPDATE based on OP comments.
RE: JobTracker unable to contact NameNode: Please do not skip steps.
First make sure you format the NameNode
Then start the NameNode and DataNodes
Run some basic HDFS commands such as
hdfs dfs -put
and
hdfs dfs -get
Then you can start the JobTracker and TaskTracker
Then (and not earlier) you can try to run some MapReduce job (which uses hdfs)
1) Please run "jps" in console and show what it outputs
2) Please provide core-site.xml (I think you might have wrong fs.default.name)
Concerning this error:
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N) y
Format aborted in /tmp/hadoop-root/dfs/name
You need to use a capital Y, not a lowercase y in order for it to accept the input and actually do the formatting.
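The "Connection refused" in the log above means that nothing was accepting connections on localhost:54310 when the JobTracker tried to reach the NameNode. A minimal sketch in Python of how one might confirm that before digging further; the helper name port_is_open is made up for this illustration, and the host and port are the ones from the fs.default.name value in the question:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # 54310 is the port from fs.default.name in the question's core-site.xml.
    print("NameNode RPC port reachable:", port_is_open("localhost", 54310))
```

If this prints False while jps shows no NameNode process, the NameNode is most likely not running at all, which matches the failed format step.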
How can I figure out the URI my hdfs dfs commands are connecting to?
Is there any configuration file that stores the URI or any command that can be used to display it?
I looked into the FileSystemShell documentation and the dfsadmin documentation without success. (Also, I do not have access to most of the dfsadmin commands.)
When I call a command with hdfs:///user/myUserName/... it throws the exception:
Exception in thread "main" java.io.IOException: Incomplete HDFS URI, no host: hdfs:///user/myUserName/test.avro
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.avro.mapred.FsInput.<init>(FsInput.java:38)
at org.apache.avro.tool.Util.openSeekableFromFS(Util.java:110)
at org.apache.avro.tool.DataFileGetSchemaTool.run(DataFileGetSchemaTool.java:47)
at org.apache.avro.tool.Main.run(Main.java:87)
at org.apache.avro.tool.Main.main(Main.java:76)
Simple commands like hdfs dfs -ls are working fine.
Using Hadoop 3.1.0.
If you're able to access the file core-site.xml, then you can look for the value assigned to property fs.defaultFS
$ grep -A 2 defaultFS /etc/hadoop/conf/core-site.xml
<name>fs.defaultFS</name>
<value>hdfs://bigdataserver-2.internal.cloudapp.net:8020</value>
</property>
Note: I use Cloudera, and core-site.xml is where I find this value. On plain Apache Hadoop, if the property is not set in core-site.xml, the built-in default from core-default.xml applies.
Check this out: https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-common/core-default.xml
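If grep is awkward because the property spans several lines, the same lookup can be sketched in Python with the standard XML parser. This is only an illustration: read_hadoop_property is a hypothetical helper, and SAMPLE simply reuses the value shown in the grep output above. Where the hdfs getconf subcommand is available, hdfs getconf -confKey fs.defaultFS should print the effective value without any parsing.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the <value> of the named <property> in a Hadoop *-site.xml, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Sample content modelled on the grep output above (not a real cluster).
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://bigdataserver-2.internal.cloudapp.net:8020</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(read_hadoop_property(SAMPLE, "fs.defaultFS"))
```

To use it on a real installation, read the file contents first, for example with open("/etc/hadoop/conf/core-site.xml").read(); that path is the Cloudera location from the answer and may differ elsewhere.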
I'm very new to Hadoop, so I've started following the Hadoop 2.9.2 getting-started guide. When I run the command
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar grep input output 'dfs[a-z.]+'
it returns a success, but when I look at the output/part-r-00000.txt file, which is meant to show the result, it is empty, even though the input directory contains the .xml files of etc/hadoop as it is supposed to.
I've started the whole process over and over again, reading all the logs, in order to understand where the error might be. Anyway, when I run the bin/hdfs namenode -format, it shows me this error:
ERROR common.Util: Syntax error in URI file://path to temp_directory/dfs/name. Please check hdfs configuration.
java.net.URISyntaxException: Illegal character in authority at index 7: file://path to temp_directory/dfs/name
at java.base/java.net.URI$Parser.fail(URI.java:2915)
at java.base/java.net.URI$Parser.parseAuthority(URI.java:3249)
at java.base/java.net.URI$Parser.parseHierarchical(URI.java:3160)
at java.base/java.net.URI$Parser.parse(URI.java:3116)
at java.base/java.net.URI.<init>(URI.java:600)
at org.apache.hadoop.hdfs.server.common.Util.stringAsURI(Util.java:49)
at org.apache.hadoop.hdfs.server.common.Util.stringCollectionAsURIs(Util.java:99)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getStorageDirs(FSNamesystem.java:1466)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNamespaceEditsDirs(FSNamesystem.java:1511)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNamespaceEditsDirs(FSNamesystem.java:1480)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1137)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1614)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1741)
and also this occurs when I run bin/hdfs dfs -put etc/hadoop input:
WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/federico/input/hadoop/capacity-scheduler.xml._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
It seems pretty clear that there are no DataNodes running. Given this, how can I initialize a DataNode to make things work, and how do I know whether my DataNode is running as expected?
EDIT: I've tried to follow some suggestions from different users experiencing a similar problem, and this error came out:
WARN org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker: Exception checking StorageLocation [DISK]file:/dfs/data
java.io.FileNotFoundException: File file:/dfs/data does not exist
and thus the datanode creation fails. How do I deal with it?
Please update your hdfs-site.xml as follows, setting dfs.datanode.data.dir to a directory of your choice that exists and is writable. You can find this file in etc/hadoop under the Hadoop installation directory.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/Users/myname/data/hdfs/data</value>
</property>
</configuration>
On Linux, use a similar path such as /home/myname/data/hdfs/data.
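Both errors in the question come from the storage directory values: file://path to temp_directory/dfs/name contains spaces and puts the path where a URI expects a host, and file:/dfs/data points at a directory that does not exist. A rough Python sketch of a sanity check for such values follows; check_storage_dir is a hypothetical helper for illustration, not part of Hadoop:

```python
import os
from urllib.parse import urlparse


def check_storage_dir(value):
    """Return a list of problems found in one dfs.*.dir value (empty if none)."""
    problems = []
    if any(ch.isspace() for ch in value):
        problems.append("contains whitespace")
    parsed = urlparse(value)
    if parsed.scheme == "file" and parsed.netloc:
        # file://foo/bar makes "foo" the authority (host); use file:///foo/bar.
        problems.append("file: URI has an authority part: %r" % parsed.netloc)
    # Plain paths are checked as given; file: URIs are checked by their path part.
    path = parsed.path if parsed.scheme == "file" else value
    if not problems and not os.path.isdir(path):
        problems.append("directory does not exist: %s" % path)
    return problems


if __name__ == "__main__":
    for candidate in ("file://path to temp_directory/dfs/name", "file:/dfs/data"):
        print(candidate, "->", check_storage_dir(candidate))
```

The check is deliberately conservative: it only reports the existence problem when the value itself looks well formed, since a malformed value makes the path meaningless anyway.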
I set up a Hadoop 2.6 cluster using two nodes of 8 cores each on Ubuntu 12.04. sbin/start-dfs.sh and sbin/start-yarn.sh both succeed, and I can see the following after running jps on the master node:
22437 DataNode
22988 ResourceManager
24668 Jps
22748 SecondaryNameNode
23244 NodeManager
The jps outcome on the slave node is
19693 DataNode
19966 NodeManager
I then run the PI example.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 30 100
which gives me the following error log:
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "Master-R5-Node/xxx.ww.y.zz"; destination host is: "Master-R5-Node":54310;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
The problem seems to be with HDFS itself, since the command bin/hdfs dfs -mkdir /user fails with a similar exception:
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "Master-R5-Node/xxx.ww.y.zz"; destination host is: "Master-R5-Node":54310;
where xxx.ww.y.zz is the ip-address of Master-R5-Node
I have checked and followed all the recommendations of ConnectionRefused on Apache and on this site.
Despite the week long effort, I cannot get it fixed.
Thanks.
Many different things can lead to the problem I faced, but I finally fixed it with the following steps.
Make sure that you have the needed permissions on the /hadoop directory and the HDFS temporary files. (You have to figure out where those are in your particular case.)
Remove the port number from fs.defaultFS in $HADOOP_CONF_DIR/core-site.xml. It should look like this:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://my.master.ip.address/</value>
<description>NameNode URI</description>
</property>
</configuration>
Add the following two properties to $HADOOP_CONF_DIR/hdfs-site.xml:
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
Voila! You should now be up and running!
When I am running the following query in hive:
hive> select count(*) from testsql;
I am getting the following error:
Error
FAILED: RuntimeException java.net.ConnectException: Call From impetus-1466/192.168.49.77 to impetus-1466:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
The jps looks like:
[impadmin@impetus-1466 hadoop-1.0.3.15]$ jps
26380 TaskTracker
26709 Jps
26230 JobTracker
25943 NameNode
I started the daemons with:
$ start-all.sh
$ start-dfs.sh
$ start-mapred.sh
How could this be solved?
Thanks
If you can open http://localhost:8088/cluster but cannot open http://localhost:50070/, the DataNode may not have started or the NameNode may not have been formatted.
Also check hadoop.tmp.dir in core-site.xml. If it is not set, it defaults to a directory under /tmp (/tmp/hadoop-${user.name}), so set it explicitly:
<property>
<name>hadoop.tmp.dir</name>
<value>/path/to/hadoop/tmp</value>
</property>
Then stop Hadoop, reformat the NameNode with hdfs namenode -format, and restart Hadoop.
Similar question http://localhost:50070 does not work HADOOP
The reason for this is that either there are no DataNodes in your cluster or the DataNodes do not know their NameNode. This can happen when the NameNode has been formatted more than once: the NameNode's cluster ID changes, but the change is not propagated to the DataNodes, which keep the old one.
The below links might be helpful:
Datanode not starts correctly
http://hortonworks.com/community/forums/topic/clusterid-mismatch-for-namenode-and-datanodes-in-fully-distributed-cluster/
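The cluster ID mismatch described above can be checked by hand. To my knowledge, in Hadoop 2.x and later each NameNode and DataNode storage directory contains a current/VERSION file in Java-properties style with a clusterID entry. A small Python sketch that compares the two, assuming you have read both files into strings; the function names and the sample IDs below are invented for illustration:

```python
def parse_version_file(text):
    """Parse the key=value lines of a Hadoop storage VERSION file into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the timestamp comment
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def cluster_ids_match(namenode_version, datanode_version):
    """Return True only if both texts carry the same, non-missing clusterID."""
    nn_id = parse_version_file(namenode_version).get("clusterID")
    dn_id = parse_version_file(datanode_version).get("clusterID")
    return nn_id is not None and nn_id == dn_id


# Hypothetical file contents; real IDs look similar but are generated by Hadoop.
NN_SAMPLE = "#Mon Jan 01 00:00:00 UTC 2018\nclusterID=CID-aaaa\nstorageType=NAME_NODE\n"
DN_SAMPLE = "#Mon Jan 01 00:00:00 UTC 2018\nclusterID=CID-bbbb\nstorageType=DATA_NODE\n"

if __name__ == "__main__":
    print("cluster IDs match:", cluster_ids_match(NN_SAMPLE, DN_SAMPLE))
```

When the IDs differ, the usual options are to clear the DataNode data directory so it re-registers with the freshly formatted NameNode, which discards the blocks stored there, or to restore the matching clusterID; which one is appropriate depends on whether the old data matters.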
As a start, I've installed Hadoop (0.15.2) and set up a cluster of 3 nodes: one each for the NameNode, DataNode and JobTracker. All the daemons are up and running, but when I issue any command I get the above error. For instance, when I do a copyFromLocal, I get the following error:
Am I missing something?
More details:
I am trying to install Hadoop on an NFS file system. I installed version 1.0.4 and tried running it, but to no avail: version 1.0.4 doesn't start the DataNode, and the DataNode log files are empty. Hence I switched back to version 0.15, which at least started all the daemons.
I believe the problem is due to the underlying NFS file system i.e. all the datanodes and masters using the same files and folders. But I am not sure if that is actually the case.
But I don't see any reason why I shouldn't be able to run Hadoop on NFS (after appropriately setting the configuration parameters).
Currently I am trying and figuring out if I could set the name and data directories differently for different machines based on the individual machine names.
Configuration file: (hadoop-site.xml)
<property>
<name>fs.default.name</name>
<value>mumble-12.cs.wisc.edu:9001</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>mumble-13.cs.wisc.edu:9001</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.secondary.info.port</name>
<value>9002</value>
</property>
<property>
<name>dfs.info.port</name>
<value>9003</value>
</property>
<property>
<name>mapred.job.tracker.info.port</name>
<value>9004</value>
</property>
<property>
<name>tasktracker.http.port</name>
<value>9005</value>
</property>
Error using Hadoop 1.0.4 (DataNode doesn't get started):
2013-04-22 18:50:50,438 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001, call addBlock(/tmp/hadoop-akshar/mapred/system/jobtracker.info, DFSClient_502734479, null) from 128.105.112.13:37204: error: java.io.IOException: File /tmp/hadoop-akshar/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-akshar/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
Error using Hadoop 0.15.2:
[akshar@mumble-12] (38)$ bin/hadoop fs -copyFromLocal lib/junit-3.8.1.LICENSE.txt input
13/04/17 03:22:11 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
13/04/17 03:22:12 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
13/04/17 03:22:12 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readShort(DataInputStream.java:312)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1660)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:120)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1360)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1478)
copyFromLocal: Connection reset
I was able to get Hadoop to run over NFS using version 1.1.2. It might work for other versions, but I can't guarantee anything.
If you have an NFS file system then each node should have access to the filesystem. The fs.default.name tells Hadoop the filesystem URI to use, so it should be pointed to the local disk. I'll assume that your NFS directory is mounted to each node at /nfs.
In core-site.xml you should define:
<property>
<name>fs.default.name</name>
<value>file:///</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/nfs/tmp</value>
</property>
In mapred-site.xml you should define:
<property>
<name>mapred.job.tracker</name>
<value>node1:8021</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/tmp/mapred-local</value>
</property>
Since hadoop.tmp.dir points to the NFS drive, the default locations of mapred.system.dir and mapreduce.jobtracker.staging.root.dir also end up on the NFS drive. It might run with the default value for mapred.local.dir, but that property is supposed to point to the local filesystem, so to be safe you can put it in /tmp.
You don't have to worry about hdfs-site.xml. This configuration file is used when you start the namenode, but with everything being distributed on the nfs drive you shouldn't run HDFS.
Now you can run start-mapred.sh on the JobTracker node and run a Hadoop job. Don't run start-all.sh or start-dfs.sh, because those will start HDFS. If you run multiple DataNodes that point to the same NFS directory, one DataNode will lock that directory and the others will shut down because they are unable to obtain a lock.
I tested the configuration with:
bin/hadoop jar hadoop-examples-1.1.2.jar wordcount /nfs/data/test.text /nfs/out
Note that you need to specify full paths to the input and output locations.
I also tried:
bin/hadoop jar hadoop-examples-1.1.2.jar grep /nfs/data/loremIpsum.txt /nfs/out2 lorem
It gave me the same output as when I run it in Standalone, so I assume it is performing correctly.
Here is more information on fs.default.name:
http://www.greenplum.com/blog/dive-in/usage-and-quirks-of-fs-default-name-in-hadoop-filesystem