Spark 1.2 cannot connect to HDFS on HDP 2.2 - hadoop

I followed this tutorial http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/ to install Spark on HDP 2.2.
But HDFS refuses my connection!
The command I run:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10
Here is the log:
tput: No value for $TERM and no -T specified
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/02/04 13:52:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/04 13:52:52 INFO impl.TimelineClientImpl: Timeline service address: http://amb7.a.b.c:8188/ws/v1/timeline/
15/02/04 13:52:53 INFO client.RMProxy: Connecting to ResourceManager at amb7.a.b.c/172.0.22.8:8050
15/02/04 13:52:53 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
15/02/04 13:52:53 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2048 MB per container)
15/02/04 13:52:53 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/02/04 13:52:53 INFO yarn.Client: Setting up container launch context for our AM
15/02/04 13:52:53 INFO yarn.Client: Preparing resources for our AM container
15/02/04 13:52:54 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/02/04 13:52:54 INFO yarn.Client: Uploading resource file:/tmp/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar -> hdfs://amb1.a.b.c:8020/user/hdfs/.sparkStaging/application_1423073070725_0007/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar
15/02/04 13:52:54 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1611)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1409)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1362)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:589)
15/02/04 13:52:54 INFO hdfs.DFSClient: Abandoning BP-470883394-172.0.91.7-1423072968591:blk_1073741885_1061
15/02/04 13:52:54 INFO hdfs.DFSClient: Excluding datanode 172.0.11.0:50010
15/02/04 13:52:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1611)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1409)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1362)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:589)
15/02/04 13:52:55 INFO hdfs.DFSClient: Abandoning BP-470883394-172.0.91.7-1423072968591:blk_1073741886_1062
15/02/04 13:52:55 INFO hdfs.DFSClient: Excluding datanode 172.0.81.0:50010
15/02/04 13:52:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1611)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1409)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1362)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:589)
15/02/04 13:52:55 INFO hdfs.DFSClient: Abandoning BP-470883394-172.0.91.7-1423072968591:blk_1073741887_1063
15/02/04 13:52:55 INFO hdfs.DFSClient: Excluding datanode 172.0.7.0:50010
15/02/04 13:52:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1611)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1409)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1362)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:589)
15/02/04 13:52:55 INFO hdfs.DFSClient: Abandoning BP-470883394-172.0.91.7-1423072968591:blk_1073741888_1064
15/02/04 13:52:55 INFO hdfs.DFSClient: Excluding datanode 172.0.65.0:50010
15/02/04 13:52:55 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1375)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:589)
15/02/04 13:52:55 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hdfs/.sparkStaging/application_1423073070725_0007/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar" - Aborting...
Exception in thread "main" java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1611)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1409)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1362)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:589)
Somebody suggests this (http://tiku.io/questions/4795653/unable-to-run-spark-1-0-sparkpi-on-hdp-2-0), but I can't understand it.
My environment:
HDP: 2.2
Ambari: 1.7
Spark: 1.2

You must modify the parameter that manages the scheduler's maximum allocation; increase it according to the available memory capacity in your cluster.
In the YARN settings:
yarn.scheduler.maximum-allocation-mb = 3072 (or a greater amount)
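In Ambari this lives under the YARN configs; edited directly, a yarn-site.xml sketch might look like the following (3072 is just an example value; size it to your nodes):
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>3072</value>
</property>
After changing it, restart the ResourceManager so the new ceiling takes effect.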

Related

ERROR streaming.StreamJob: Job not successful

I installed Hadoop 2.9.0 and I have 4 nodes. The NameNode and ResourceManager services are running on the master, and the DataNodes and NodeManagers are running on the slaves. Now I want to run a Python MapReduce job, but the job is not successful!
Please tell me what I should do.
Log of the job run in the terminal:
hadoop#hadoopmaster:/usr/local/hadoop$ bin/hadoop jar share/hadoop/tools/lib/hadoop-streaming-2.9.0.jar -file mapper.py -mapper mapper.py -file reducer.py -reducer reducer.py -input /user/hadoop/* -output /user/hadoop/output
18/06/17 04:26:28 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
18/06/17 04:26:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [mapper.py, /tmp/hadoop-unjar3316382199020742755/] [] /tmp/streamjob4930230269569102931.jar tmpDir=null
18/06/17 04:26:28 INFO client.RMProxy: Connecting to ResourceManager at hadoopmaster/192.168.111.175:8050
18/06/17 04:26:29 INFO client.RMProxy: Connecting to ResourceManager at hadoopmaster/192.168.111.175:8050
18/06/17 04:26:29 WARN hdfs.DataStreamer: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1249)
at java.lang.Thread.join(Thread.java:1323)
at org.apache.hadoop.hdfs.DataStreamer.closeResponder(DataStreamer.java:980)
at org.apache.hadoop.hdfs.DataStreamer.endBlock(DataStreamer.java:630)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:807)
18/06/17 04:26:29 WARN hdfs.DataStreamer: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1249)
at java.lang.Thread.join(Thread.java:1323)
at org.apache.hadoop.hdfs.DataStreamer.closeResponder(DataStreamer.java:980)
at org.apache.hadoop.hdfs.DataStreamer.endBlock(DataStreamer.java:630)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:807)
18/06/17 04:26:29 INFO mapred.FileInputFormat: Total input files to process : 4
18/06/17 04:26:29 WARN hdfs.DataStreamer: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1249)
at java.lang.Thread.join(Thread.java:1323)
at org.apache.hadoop.hdfs.DataStreamer.closeResponder(DataStreamer.java:980)
at org.apache.hadoop.hdfs.DataStreamer.endBlock(DataStreamer.java:630)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:807)
18/06/17 04:26:29 INFO mapreduce.JobSubmitter: number of splits:4
18/06/17 04:26:29 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
18/06/17 04:26:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1529233655437_0004
18/06/17 04:26:30 INFO impl.YarnClientImpl: Submitted application application_1529233655437_0004
18/06/17 04:26:30 INFO mapreduce.Job: The url to track the job: http://hadoopmaster.png.com:8088/proxy/application_1529233655437_0004/
18/06/17 04:26:30 INFO mapreduce.Job: Running job: job_1529233655437_0004
18/06/17 04:45:12 INFO mapreduce.Job: Job job_1529233655437_0004 running in uber mode : false
18/06/17 04:45:12 INFO mapreduce.Job: map 0% reduce 0%
18/06/17 04:45:12 INFO mapreduce.Job: Job job_1529233655437_0004 failed with state FAILED due to: Application application_1529233655437_0004 failed 2 times due to Error launching appattempt_1529233655437_0004_000002. Got exception: org.apache.hadoop.net.ConnectTimeoutException: Call From hadoopmaster.png.com/192.168.111.175 to hadoopslave1.png.com:40569 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=hadoopslave1.png.com/192.168.111.173:40569]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
at sun.reflect.GeneratedConstructorAccessor38.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:774)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497)
at org.apache.hadoop.ipc.Client.call(Client.java:1439)
at org.apache.hadoop.ipc.Client.call(Client.java:1349)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy82.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy83.startContainers(Unknown Source)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:122)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:307)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=hadoopslave1.png.com/192.168.111.173:40569]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:687)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:790)
at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:411)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1554)
at org.apache.hadoop.ipc.Client.call(Client.java:1385)
... 19 more
. Failing the application.
18/06/17 04:45:13 INFO mapreduce.Job: Counters: 0
18/06/17 04:45:13 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
OK, I found the cause of the problem. In fact, the following error had to be resolved:
hadoopmaster.png.com/192.168.111.175 to hadoopslave1.png.com:40569
failed on socket timeout exception
So I just did:
sudo ufw allow 40569
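To verify the fix, a quick reachability check from the master (assuming nc is available; host and port are taken from the error above) might be:
nc -zv hadoopslave1.png.com 40569
Note that this container-launch port is assigned dynamically by default, so it can change when the NodeManager restarts; allowing all traffic between cluster nodes (or pinning yarn.nodemanager.address to a fixed port) is more robust than opening a single port.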

Spark HDFS Exception in createBlockOutputStream while uploading resource file

I'm trying to run my JAR in the cluster with yarn-cluster, but I'm getting an exception after a while. The last INFO before it fails is Uploading resource. I've checked all the security groups and did an hdfs ls with success, but I'm still getting the error.
./bin/spark-submit --class MyMainClass --master yarn-cluster /tmp/myjar-1.0.jar myjarparameter
16/01/21 16:13:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/21 16:13:52 INFO client.RMProxy: Connecting to ResourceManager at yarn.myserver.com/publicip:publicport
16/01/21 16:13:53 INFO yarn.Client: Requesting a new application from cluster with 10 NodeManagers
16/01/21 16:13:53 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (13312 MB per container)
16/01/21 16:13:53 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/01/21 16:13:53 INFO yarn.Client: Setting up container launch context for our AM
16/01/21 16:13:53 INFO yarn.Client: Preparing resources for our AM container
16/01/21 16:13:54 INFO yarn.Client: Uploading resource file:/opt/spark-1.2.0-bin-hadoop2.3/lib/spark-assembly-1.2.0-hadoop2.3.0.jar -> hdfs://hdfs.myserver.com/user/henrique/.sparkStaging/application_1452514285349_6427/spark-assembly-1.2.0-hadoop2.3.0.jar
16/01/21 16:14:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/PRIVATE_IP:50010]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1341)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1167)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1122)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:522)
16/01/21 16:14:55 INFO hdfs.DFSClient: Abandoning BP-26920217-10.140.213.58-1440247331237:blk_1132201932_58466886
16/01/21 16:14:55 INFO hdfs.DFSClient: Excluding datanode 10.164.16.207:50010
16/01/21 16:15:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream
./bin/hadoop fs -ls /user/henrique/.sparkStaging/
drwx------ henrique supergroup 0 2016-01-20 18:36 user/henrique/.sparkStaging/application_1452514285349_5868
drwx------ henrique supergroup 0 2016-01-21 16:13 user/henrique/.sparkStaging/application_1452514285349_6427
drwx------ henrique supergroup 0 2016-01-21 17:06 user/henrique/.sparkStaging/application_1452514285349_6443
SOLVED! Hadoop was trying to connect to private IPs. The problem was solved by adding this config to hdfs-site.xml:
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
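If editing hdfs-site.xml on the client is not an option, the same flag can be passed per job, since Spark forwards spark.hadoop.* properties into the Hadoop configuration; a sketch, reusing the submit command from the question:
./bin/spark-submit --class MyMainClass --master yarn-cluster \
  --conf spark.hadoop.dfs.client.use.datanode.hostname=true \
  /tmp/myjar-1.0.jar myjarparameter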

DataNode daemon not running on CDH 4.2.1 in pseudo-distributed mode

I am running hadoop-2.0.0-cdh4.2.1 on CentOS in pseudo-distributed mode. When I issue the command sudo jps, I don't see the DataNode daemon up and running.
Below is the error log that I got in the log file http://localhost:50070/logs/hadoop-hdfs-datanode-localhost.localdomain.log on the NameNode:
2015-05-12 04:35:26,319 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-539882958-127.0.0.1-1386722652683 (storage id DS-1842390259-127.0.0.1-50010-1431419699539) service to /0.0.0.0:8020 beginning handshake with NN
2015-05-12 04:35:28,573 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-539882958-127.0.0.1-1386722652683 (storage id DS-1842390259-127.0.0.1-50010-1431419699539) service to 0.0.0.0/0.0.0.0:8020
java.io.IOException: Failed on local exception: java.io.IOException: Connection reset by peer; Host Details : local host is: "localhost.localdomain/127.0.0.1"; destination host is: "0.0.0.0":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:760)
at org.apache.hadoop.ipc.Client.call(Client.java:1229)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy10.registerDatanode(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy10.registerDatanode(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.registerDatanode(DatanodeProtocolClientSideTranslatorPB.java:149)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.register(BPServiceActor.java:619)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:221)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
at sun.nio.ch.IOUtil.read(IOUtil.java:171)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:56)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:143)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:156)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.FilterInputStream.read(FilterInputStream.java:116)
at java.io.FilterInputStream.read(FilterInputStream.java:116)
at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:409)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at java.io.FilterInputStream.read(FilterInputStream.java:66)
at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:276)
at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:938)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:836)
2015-05-12 04:35:28,578 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-539882958-127.0.0.1-1386722652683 (storage id DS-1842390259-127.0.0.1-50010-1431419699539) service to 0.0.0.0/0.0.0.0:8020
2015-05-12 04:35:28,595 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-539882958-127.0.0.1-1386722652683 (storage id DS-1842390259-127.0.0.1-50010-1431419699539)
2015-05-12 04:35:28,595 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Removed bpid=BP-539882958-127.0.0.1-1386722652683 from blockPoolScannerMap
2015-05-12 04:35:28,595 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-539882958-127.0.0.1-1386722652683
2015-05-12 04:35:30,597 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2015-05-12 04:35:30,600 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2015-05-12 04:35:30,603 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at localhost.localdomain/127.0.0.1
************************************************************/
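The handshake target /0.0.0.0:8020 in the first log line suggests the DataNode is resolving a wildcard NameNode address, i.e. fs.default.name / fs.defaultFS in core-site.xml is probably set to 0.0.0.0 instead of a real hostname. A quick check (a sketch; run on the DataNode host):
# print the NameNode URI the HDFS daemons will resolve
hdfs getconf -confKey fs.defaultFS
# CDH4 may still use the deprecated key
hdfs getconf -confKey fs.default.name
It should print something like hdfs://localhost:8020, not hdfs://0.0.0.0:8020.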

NodeManager starts and stops in Slave Node

In my 2-node cluster, when I try to run start-yarn.sh, it starts the NodeManager on both the master and slave nodes. But on the slave node, it gets killed after some time. Here is what I found in the log file:
2014-09-08 13:37:12,224 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService is stopped
2014-09-08 13:37:12,224 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.DeletionService is stopped
2014-09-08 13:37:12,224 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.se$
at org.apache.hadoop.yarn.service.CompositeService.start(CompositeServi$
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeMana$
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNo$
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManag$
Caused by: org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredTh$
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.star$
at org.apache.hadoop.yarn.service.CompositeService.start(CompositeServi$
... 3 more
Caused by: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBCl$
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.regi$
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.star$
... 4 more
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Cal$
at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(P$
at com.sun.proxy.$Proxy24.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBCl$
... 6 more
Caused by: java.net.ConnectException: Call From slave1/10.255.255.35 to 0.0.0.0$
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:700)
at org.apache.hadoop.ipc.Client.call(Client.java:1099)
at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(P$
... 8 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:70$
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeou$
2014-09-08 13:37:12,239 INFO org.apache.hadoop.ipc.Server: Stopping server on 4
2014-09-08 13:37:12,240 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
2014-09-08 13:37:12,240 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
2014-09-08 13:37:12,241 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
2014-09-08 13:37:12,241 INFO org.apache.hadoop.yarn.server.nodemanager.NodeMana
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at slave1/10.255.255.35
************************************************************/
Any help?
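The truncated "Call From slave1/10.255.255.35 to 0.0.0.0..." line suggests the NodeManager on the slave is trying to register with a ResourceManager at the default wildcard address, i.e. the slave's yarn-site.xml does not point at the master. A sketch of the property to check (master:8031 is the conventional resource-tracker address; substitute your master's hostname):
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>master:8031</value>
</property>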

HADOOP copyFromLocal DataStreamer exception

I'm using Hadoop 0.20.203, and I have a cluster with nodes 0 ~ 24. cluster0 is used as the NameNode, and all the others are currently used as DataNodes.
I'm currently trying to execute the WordCount example; however, when I try to -copyFromLocal into DFS, the following message is shown:
aqjune#cluster0:~>> $HADOOP_HOME/bin/hadoop dfs -copyFromLocal pg132.txt /user/aqjune/input/pg132.txt
14/06/17 19:54:01 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
14/06/17 19:54:01 INFO hdfs.DFSClient: Abandoning block blk_-7530678618792869516_1003
14/06/17 19:54:07 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
14/06/17 19:54:07 INFO hdfs.DFSClient: Abandoning block blk_-7462751912508683911_1003
14/06/17 19:54:13 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
14/06/17 19:54:13 INFO hdfs.DFSClient: Abandoning block blk_252255837066920011_1003
14/06/17 19:54:19 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
14/06/17 19:54:19 INFO hdfs.DFSClient: Abandoning block blk_4030900909035905642_1003
14/06/17 19:54:25 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3002)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
14/06/17 19:54:25 WARN hdfs.DFSClient: Error Recovery for block blk_4030900909035905642_1003 bad datanode[0] nodes == null
14/06/17 19:54:25 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/aqjune/input/pg132.txt" - Aborting...
copyFromLocal: Connection refused
14/06/17 19:54:25 ERROR hdfs.DFSClient: Exception closing file /user/aqjune/input/pg132.txt : java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3028)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2983)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
And then only an empty file was created:
aqjune#grey0:~/hadoop>> bin/hadoop dfs -lsr /
drwxr-xr-x - aqjune supergroup 0 2014-06-17 19:45 /user
drwxr-xr-x - aqjune supergroup 0 2014-06-17 19:45 /user/aqjune
drwxr-xr-x - aqjune supergroup 0 2014-06-17 19:54 /user/aqjune/input
-rw-r--r-- 1 aqjune supergroup 0 2014-06-17 19:54 /user/aqjune/input/pg132.txt
I can't figure out the cause of this problem. Can I get a hint?
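A first thing to check (a sketch, not a guaranteed fix) is whether the NameNode actually sees any live DataNodes, and whether a DataNode's transfer port 50010 is reachable from the client; cluster1 below is just an example host:
# how many live DataNodes does the NameNode report?
$HADOOP_HOME/bin/hadoop dfsadmin -report
# is the data-transfer port on a given DataNode reachable?
nc -zv cluster1 50010
If the report shows zero live DataNodes, the DataNode daemons or their network configuration are the place to look.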
