Spark HDFS Exception in createBlockOutputStream while uploading resource file

I'm trying to run my JAR on the cluster in yarn-cluster mode, but I get an exception after a while. The last INFO line before the failure is "Uploading resource". I've checked all the security groups and ran hdfs ls successfully, but I still get the error.
./bin/spark-submit --class MyMainClass --master yarn-cluster /tmp/myjar-1.0.jar myjarparameter
16/01/21 16:13:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/21 16:13:52 INFO client.RMProxy: Connecting to ResourceManager at
16/01/21 16:13:53 INFO yarn.Client: Requesting a new application from cluster with 10 NodeManagers
16/01/21 16:13:53 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (13312 MB per container)
16/01/21 16:13:53 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/01/21 16:13:53 INFO yarn.Client: Setting up container launch context for our AM
16/01/21 16:13:53 INFO yarn.Client: Preparing resources for our AM container
16/01/21 16:13:54 INFO yarn.Client: Uploading resource file:/opt/spark-1.2.0-bin-hadoop2.3/lib/spark-assembly-1.2.0-hadoop2.3.0.jar -> hdfs://
16/01/21 16:14:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/PRIVATE_IP:50010]
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(
at org.apache.hadoop.hdfs.DFSOutputStream$
16/01/21 16:14:55 INFO hdfs.DFSClient: Abandoning BP-26920217-
16/01/21 16:14:55 INFO hdfs.DFSClient: Excluding datanode
16/01/21 16:15:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream
./bin/hadoop fs -ls /user/henrique/.sparkStaging/
drwx------- henrique supergroup 0 2016-01-20 18:36 user/henrique/.sparkStaging/application_1452514285349_5868
drwx------ henrique supergroup 0 2016-01-21 16:13 user/henrique/.sparkStaging/application_1452514285349_6427
drwx------ henrique supergroup 0 2016-01-21 17:06 user/henrique/.sparkStaging/application_1452514285349_6443

SOLVED! Hadoop was trying to connect to private IPs. The problem was solved by adding this config to hdfs-site.xml
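The log shows a connect timeout to a DataNode at a private IP on port 50010. A directory listing can still succeed in that situation, because it only talks to the NameNode, whereas an upload has to open a connection to a DataNode. A minimal sketch for checking DataNode reachability from the submitting machine could look like this; the function name is hypothetical, and the host names you pass in are your own.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name-resolution failures.
        return False
```

For example, can_connect("PRIVATE_IP", 50010) with the address from the log should return False from a machine that cannot route to the private network, which would match the timeout seen above.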


Spark is running but YARN can't find it

I have a problem with my YARN cluster.
I run hdfs-namenode, hdfs-datanode and yarn on localhost, and then a spark-master and a spark-worker on localhost too, like this:
$ jps
5809 Main
53730 ResourceManager
53540 SecondaryNameNode
53125 NameNode
56710 Master
54009 NodeManager
56809 Worker
53308 DataNode
56911 Jps
I can see in the Spark web UI that the spark-worker is connected to the spark-master:
[image: Spark web UI]
But in the YARN web UI, the Nodes of the cluster page shows nothing:
[image: YARN web UI, Nodes of the cluster page]
My conf/ is
export SCALA_HOME="/opt/scala-2.11.8/"
export JAVA_HOME="/opt/jdk1.8.0_101/"
export HADOOP_HOME="/opt/hadoop-2.7.3/"
export HADOOP_CONF_DIR="/opt/hadoop-2.7.3/etc/hadoop/"
export SPARK_LOCAL_DIRS="/opt/spark-2.0.0-bin-hadoop2.7/"
And conf/spark-defaults.conf is
spark.master spark://
spark.yarn.submit.waitAppCompletion false
spark.yarn.access.namenodes hdfs://
And yarn-site.xml is
When I submit an application with
spark-submit --master yarn --deploy-mode cluster
I get output like this:
16/10/12 16:19:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/10/12 16:19:30 INFO client.RMProxy: Connecting to ResourceManager at /
16/10/12 16:19:30 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/10/12 16:19:30 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/10/12 16:19:30 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/10/12 16:19:30 INFO yarn.Client: Setting up container launch context for our AM
16/10/12 16:19:30 INFO yarn.Client: Setting up the launch environment for our AM container
16/10/12 16:19:30 INFO yarn.Client: Preparing resources for our AM container
16/10/12 16:19:31 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/10/12 16:19:32 INFO yarn.Client: Uploading resource file:/opt/spark-2.0.0-bin-hadoop2.7/spark-3cdb2435-d6a0-4ce0-a54a-f2849d5f4909/ -> hdfs://
16/10/12 16:19:33 INFO yarn.Client: Uploading resource file:/home/fuxiuyin/PycharmProjects/spark-test/ -> hdfs://
16/10/12 16:19:33 INFO yarn.Client: Uploading resource file:/opt/spark-2.0.0-bin-hadoop2.7/python/lib/ -> hdfs://
16/10/12 16:19:33 INFO yarn.Client: Uploading resource file:/opt/spark-2.0.0-bin-hadoop2.7/python/lib/ -> hdfs://
16/10/12 16:19:33 INFO yarn.Client: Uploading resource file:/opt/spark-2.0.0-bin-hadoop2.7/spark-3cdb2435-d6a0-4ce0-a54a-f2849d5f4909/ -> hdfs://
16/10/12 16:19:33 INFO spark.SecurityManager: Changing view acls to: fuxiuyin
16/10/12 16:19:33 INFO spark.SecurityManager: Changing modify acls to: fuxiuyin
16/10/12 16:19:33 INFO spark.SecurityManager: Changing view acls groups to:
16/10/12 16:19:33 INFO spark.SecurityManager: Changing modify acls groups to:
16/10/12 16:19:33 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(fuxiuyin); groups with view permissions: Set(); users with modify permissions: Set(fuxiuyin); groups with modify permissions: Set()
16/10/12 16:19:33 INFO yarn.Client: Submitting application application_1476256306830_0002 to ResourceManager
16/10/12 16:19:33 INFO impl.YarnClientImpl: Submitted application application_1476256306830_0002
16/10/12 16:19:33 INFO yarn.Client: Application report for application_1476256306830_0002 (state: ACCEPTED)
16/10/12 16:19:33 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1476260373944
final status: UNDEFINED
tracking URL: http://localhost:8088/proxy/application_1476256306830_0002/
user: fuxiuyin
16/10/12 16:19:33 INFO util.ShutdownHookManager: Shutdown hook called
16/10/12 16:19:33 INFO util.ShutdownHookManager: Deleting directory /opt/spark-2.0.0-bin-hadoop2.7/spark-3cdb2435-d6a0-4ce0-a54a-f2849d5f4909
The submission succeeds, but in the YARN web UI the app never runs; it stays in ACCEPTED.
It looks like no Spark node runs this app.
Can anyone tell me what's wrong?
You can specify one type of cluster:
YARN (cluster or client mode)
Spark standalone
You have started a Spark standalone server and you're connecting to that cluster manager. If you want to run Spark on YARN, you should specify the yarn master: just --master yarn
Please add logs and the spark-submit command. Please also post how you are launching YARN. If the first attempt was wrong, then it means you have a configuration problem.
Edit number 2: It seems that YARN doesn't have enough resources to run your application. Please check your config, e.g. check whether increasing yarn.nodemanager.resource.memory-mb helps. You can also go to the Spark Web UI - http://application-master-ip:4040 - and see information from the Spark Context.
Also, you can check whether you can deploy the application to Spark Standalone (which you are also starting) just by setting --master spark://...: as in the configuration. Then you will know whether the problem is with YARN or with Spark.
BTW, you can skip running Spark Standalone if you're submitting to YARN :) The memory used by the Standalone Workers can then be used by YARN.
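An application that stays in ACCEPTED is typically waiting for a container for its ApplicationMaster. As a rough sketch, the ResourceManager's REST endpoint /ws/v1/cluster/metrics can be queried to see whether any NodeManager is active and how much memory is available. The function names and the default URL are assumptions for illustration; the 1408 MB figure in the usage note is the AM container size from the log above.

```python
import json
from urllib.request import urlopen


def fetch_cluster_metrics(rm_url: str = "http://localhost:8088") -> dict:
    """Fetch the clusterMetrics object from the ResourceManager REST API."""
    with urlopen(f"{rm_url}/ws/v1/cluster/metrics", timeout=10) as resp:
        return json.load(resp)["clusterMetrics"]


def diagnose(metrics: dict, am_memory_mb: int) -> list:
    """Return reasons why an AM container of am_memory_mb might not be scheduled."""
    problems = []
    if metrics.get("activeNodes", 0) == 0:
        problems.append("no active NodeManagers")
    if metrics.get("unhealthyNodes", 0) > 0:
        problems.append(f"{metrics['unhealthyNodes']} unhealthy NodeManager(s)")
    available = metrics.get("availableMB", 0)
    if available < am_memory_mb:
        problems.append(
            f"only {available} MB available, AM needs {am_memory_mb} MB"
        )
    return problems
```

Calling diagnose(fetch_cluster_metrics(), am_memory_mb=1408) against a running ResourceManager returns an empty list when none of these three conditions applies; it does not rule out other causes such as queue limits.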
Thanks everyone, and sorry for wasting your time. When I checked the resources at http://localhost:8088/ I noticed this:
I just stopped the server and deleted the tmp directory and the logs directory. Then it worked.
Thank you again

Spark: Unknown/unsupported param error when setting conf.yarn.jar

I have a little application that runs fine on my YARN-based Spark cluster when I submit it with spark-submit like this:
~/spark-1.4.0-bin-hadoop2.4$ bin/spark-submit --class MyClass --master yarn-cluster --queue testing myApp.jar hdfs://nameservice1/user/XXX/README.md_count
However, I would like to avoid uploading the spark-assembly.jar file each time, so I set the spark.yarn.jar configuration parameter:
~/spark-1.4.0-bin-hadoop2.4$ bin/spark-submit --class MyClass --master yarn-cluster --queue testing --conf "spark.yarn.jar=hdfs://nameservice1/user/spark/share/lib/spark-assembly.jar" myApp.jar hdfs://nameservice1/user/XXX/README.md_count
This seems to be fine at first:
15/07/08 13:57:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/07/08 13:57:18 INFO yarn.Client: Requesting a new application from cluster with 24 NodeManagers
15/07/08 13:57:18 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/07/08 13:57:18 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/07/08 13:57:18 INFO yarn.Client: Setting up container launch context for our AM
15/07/08 13:57:18 INFO yarn.Client: Preparing resources for our AM container
15/07/08 13:57:18 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs://nameservice1/user/spark/share/lib/spark-assembly.jar
However, it fails eventually:
15/07/08 13:57:18 INFO yarn.Client: Submitting application 670 to ResourceManager
15/07/08 13:57:18 INFO impl.YarnClientImpl: Submitted application application_1434986503384_0670
15/07/08 13:57:19 INFO yarn.Client: Application report for application_1434986503384_0670 (state: ACCEPTED)
15/07/08 13:57:19 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: testing
start time: 1436356638869
final status: UNDEFINED
tracking URL: http://node-00a/cluster/app/application_1434986503384_0670
user: XXX
15/07/08 13:57:20 INFO yarn.Client: Application report for application_1434986503384_0670 (state: ACCEPTED)
15/07/08 13:57:21 INFO yarn.Client: Application report for application_1434986503384_0670 (state: ACCEPTED)
15/07/08 13:57:23 INFO yarn.Client: Application report for application_1434986503384_0670 (state: FAILED)
15/07/08 13:57:23 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1434986503384_0670 failed 2 times due to AM Container for appattempt_1434986503384_0670_000002 exited with exitCode: 1 due to: Exception from container-launch.
Container id: container_1434986503384_0670_02_000001
Exit code: 1
In the Yarn log, I find the following error message indicating a wrong usage of parameters:
Container: container_1434986503384_0670_01_000001 on node-01b_8041
Log Upload Time:Mi Jul 08 13:57:22 +0200 2015
Log Contents:
Unknown/unsupported param List(--arg, hdfs://nameservice1/user/XXX/README.md_count, --executor-memory, 1024m, --executor-cores, 1, --num-executors, 2)
Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
--jar JAR_PATH Path to your application's JAR file (required)
--class CLASS_NAME Name of your application's main class (required)
--args ARGS Arguments to be passed to your application's main class.
Mutliple invocations are possible, each will be passed in order.
--num-executors NUM Number of executors to start (Default: 2)
--executor-cores NUM Number of cores for the executors (Default: 1)
--executor-memory MEM Memory per executor (e.g. 1000M, 2G) (Default: 1G)
End of LogType:stderr
As the same application runs when uploading the local assembly file upon submission, it seems to come down to the assembly file. Could the one on the cluster be a wrong/different version? How could I validate that? What other reasons might be the cause? Is the warning WARN util.NativeCodeLoader: ... possibly related?
The same happens when I set the (deprecated) environment variable SPARK_JAR instead of setting spark.yarn.jar.
Asking the obvious question here: are you sure the spark-assembly.jar on HDFS is the same one as you have locally? If not, can you try uploading your local spark-assembly to your home directory on HDFS and try again?
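The container log hints at a mismatch: the client passes --arg, while the ApplicationMaster's usage text only lists --args, which would be consistent with the two sides coming from different Spark builds. One way to validate this is to fetch the assembly from HDFS to a local path (for example with hadoop fs -get) and compare checksums with the local assembly. The sketch below assumes both files are already on the local disk; the function names are hypothetical.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large assembly jars are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a: str, path_b: str) -> bool:
    """Return True if both files have identical contents (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If same_file returns False for the local assembly and the copy fetched from HDFS, the two jars differ, and uploading the local one to HDFS as suggested above is the next thing to try.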

Spark 1.3.0: Running Pi example on YARN fails

I have Hadoop with Hive.
After building Spark with the command:
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests package
I try to run the Pi example on YARN with the following command:
export HADOOP_CONF_DIR=/etc/hadoop/conf
/var/home2/test/spark/bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--executor-memory 3G \
--num-executors 50 \
hdfs:///user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar
I get an exception: application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_000002 exited with exitCode: 1, which in fact is Diagnostics: Exception from container-launch (please see the log below).
Application tracking url reveals the following messages:
java.lang.Exception: Unknown container. Container either has not started or has already completed or doesn't belong to this node at all
and also:
Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
I have Hadoop working fine on 4 nodes and am completely at a loss as to how to make Spark work on YARN.
Should I set the spark.yarn.access.namenodes Spark configuration property? My application does not need to access any name nodes directly, but maybe this will solve the problem?
Please advise where to look; any ideas would be of great help, thank you!
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/04/06 10:53:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/06 10:53:42 INFO impl.TimelineClientImpl: Timeline service address:
15/04/06 10:53:42 INFO client.RMProxy: Connecting to ResourceManager at
15/04/06 10:53:42 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
15/04/06 10:53:42 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4096 MB per container)
15/04/06 10:53:42 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/04/06 10:53:42 INFO yarn.Client: Setting up container launch context for our AM
15/04/06 10:53:42 INFO yarn.Client: Preparing resources for our AM container
15/04/06 10:53:43 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/04/06 10:53:43 INFO yarn.Client: Uploading resource file:/var/home2/test/spark-1.3.0/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.6.0.jar -> hdfs://
15/04/06 10:53:44 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar
15/04/06 10:53:44 INFO yarn.Client: Setting up the launch environment for our AM container
15/04/06 10:53:44 INFO spark.SecurityManager: Changing view acls to: test
15/04/06 10:53:44 INFO spark.SecurityManager: Changing modify acls to: test
15/04/06 10:53:44 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(test); users with modify permissions: Set(test)
15/04/06 10:53:44 INFO yarn.Client: Submitting application 29 to ResourceManager
15/04/06 10:53:44 INFO impl.YarnClientImpl: Submitted application application_1427875242006_0029
15/04/06 10:53:45 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:45 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1428317623905
final status: UNDEFINED
tracking URL:
user: test
15/04/06 10:53:46 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:47 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:48 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:49 INFO yarn.Client: Application report for application_1427875242006_0029 (state: FAILED)
15/04/06 10:53:49 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_000002 exited with exitCode: 1
For more detailed output, check application tracking page:, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0029_02_000001
Exit code: 1
Exception message: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0029/container_1427875242006_0029_02_000001/ line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0029/container_1427875242006_0029_02_000001/ line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
at org.apache.hadoop.util.Shell.runCommand(
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(
at java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.util.concurrent.ThreadPoolExecutor$
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1428317623905
final status: FAILED
tracking URL:
user: test
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
If you are using Spark with HDP, you have to do the following things.
Add these entries to your $SPARK_HOME/conf/spark-defaults.conf:
spark.driver.extraJavaOptions -Dhdp.version= (your installed HDP version) -Dhdp.version= (your installed HDP version)
Create a java-opts file in $SPARK_HOME/conf and add the installed HDP version to that file, like this:
-Dhdp.version= (your installed HDP version)
To find the HDP version, run the command hdp-select status hadoop-client on the cluster.
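The "bad substitution" in the log arises because ${hdp.version} reaches the container launch script unexpanded, and a name containing a dot is not a valid shell variable. The steps above supply the value as a Java system property instead. As a sketch, the lines to write can be derived from the hdp-select output; this assumes the command prints a line of the form "hadoop-client - <version>", and the version string in the usage note is a made-up placeholder.

```python
import re


def parse_hdp_version(hdp_select_output: str) -> str:
    """Extract the version from output assumed to look like 'hadoop-client - <version>'."""
    match = re.search(r"hadoop-client\s+-\s+(\S+)", hdp_select_output)
    if match is None:
        raise ValueError("could not find a hadoop-client version in the output")
    return match.group(1)


def spark_defaults_line(version: str) -> str:
    """Build the spark-defaults.conf entry for the driver JVM options."""
    return f"spark.driver.extraJavaOptions -Dhdp.version={version}"


def java_opts_line(version: str) -> str:
    """Build the single line for $SPARK_HOME/conf/java-opts."""
    return f"-Dhdp.version={version}"
```

With the placeholder input "hadoop-client - 1.2.3.4-5678", parse_hdp_version yields "1.2.3.4-5678", which the other two helpers turn into the configuration lines described in the answer.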
This is a bug in the HDP - Spark Integration
In your spark-defaults.conf add the following lines
spark.driver.extraJavaOptions -Dhdp.version=–2041 -Dhdp.version=–2041
This should help address the issue
I think your Hadoop classpath is not set up.
lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

Hadoop: Datanode process killed

I am currently using Hadoop-2.0.3-alpha. I could work perfectly with HDFS (copying files into HDFS, getting success from an external framework, using the web frontend), but after restarting my VM the datanode process stops after a while. The namenode process and all YARN processes work without a problem. I installed Hadoop in a folder under an additional user, as I also still have Hadoop 0.2 installed, which worked fine too.
Looking at the log file of the datanode process, I found the following information:
2013-04-11 16:23:50,475 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-04-11 16:24:17,451 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
2013-04-11 16:24:23,276 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-04-11 16:24:23,279 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2013-04-11 16:24:23,480 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is user-VirtualBox
2013-04-11 16:24:28,896 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at /
2013-04-11 16:24:29,239 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2013-04-11 16:24:38,348 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-04-11 16:24:44,627 INFO org.apache.hadoop.http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-04-11 16:24:45,163 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context datanode
2013-04-11 16:24:45,164 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2013-04-11 16:24:45,164 INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2013-04-11 16:24:45,355 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at
2013-04-11 16:24:45,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
2013-04-11 16:24:45,536 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2013-04-11 16:24:45,576 INFO org.mortbay.log: jetty-6.1.26
2013-04-11 16:25:18,416 INFO org.mortbay.log: Started SelectChannelConnector#
2013-04-11 16:25:42,670 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 50020
2013-04-11 16:25:44,955 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /
2013-04-11 16:25:45,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received for nameservices: null
2013-04-11 16:25:47,079 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices for nameservices: <default>
2013-04-11 16:25:47,660 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (storage id unknown) service to localhost/ starting to offer service
2013-04-11 16:25:50,515 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2013-04-11 16:25:50,631 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2013-04-11 16:26:15,068 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/hadoop/workspace/hadoop_space/hadoop23/dfs/data/in_use.lock acquired by nodename 3099#user-VirtualBox
2013-04-11 16:26:15,720 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-474150866- (storage id DS-317990214- service to localhost/ Incompatible clusterIDs in /home/hadoop/workspace/hadoop_space/hadoop23/dfs/data: namenode clusterID = CID-1745a89c-fb08-40f0-a14d-d37d01f199c3; datanode clusterID = CID-bb3547b0-03e4-4588-ac25-f0299ff81e4f
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(
at itStorage(
at itBlockPool(
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo( 280)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake( 2)
at org.apache.hadoop.hdfs.server.datanode.BPServiceAc
2013-04-11 16:26:16,212 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-474150866- (storage id DS-317990214- service to localhost/
2013-04-11 16:26:16,276 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-474150866- (storage id DS-317990214-
2013-04-11 16:26:18,396 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2013-04-11 16:26:18,940 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2013-04-11 16:26:19,668 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at user-VirtualBox/
************************************************************/
Any ideas? Maybe I made a mistake during the installation process? But it is strange that it worked once. I should also say that when I am logged in as my additional user, I need to add sudo to execute the commands ./ start namenode and the same for the datanode.
I used this installation guide:
By the way, I use the Oracle Java-7 version.
The problem could be that the namenode was formatted after the cluster was set up and the datanodes were not, so the slaves are still referring to the old namenode.
We have to delete and recreate the folder /home/hadoop/dfs/data on the local filesystem for the datanode.
Check your hdfs-site.xml file to see where the datanode data directory is pointing, delete that folder, and then restart the datanode daemon on the machine.
The steps above should recreate the folder and resolve the problem.
Please share your config info if the instructions above do not work.
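To confirm that this is the failure in question, the two IDs can be pulled out of the datanode log. This is a small sketch that assumes the message format shown in the FATAL line of the question; the function name is hypothetical.

```python
import re

# Matches the "Incompatible clusterIDs" message format seen in the datanode log.
CLUSTER_ID_PATTERN = re.compile(
    r"namenode clusterID = (?P<namenode>CID-[0-9a-f-]+); "
    r"datanode clusterID = (?P<datanode>CID-[0-9a-f-]+)"
)


def find_cluster_id_mismatch(log_text: str):
    """Return (namenode_id, datanode_id) from the log, or None if not present."""
    match = CLUSTER_ID_PATTERN.search(log_text)
    if match is None:
        return None
    return match.group("namenode"), match.group("datanode")
```

If the function returns two different IDs, the datanode's storage directory was initialized against a different namenode format than the one currently running.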
The DataNode dies because of incompatible clusterIDs. To fix this problem:
If you are using Hadoop 2.x, you have to delete everything in the folder that you have specified in hdfs-site.xml - "" (but NOT the folder itself).
The clusterID is stored in that folder. Delete its contents and restart. This should work!
You need to delete both
C:\hadoop\data\dfs\datanode and
C:\hadoop\data\dfs\namenode folders.
If you don't have these folders, open your C:\hadoop\etc\hadoop\hdfs-site.xml file and get the paths of the folders to delete. For me it says:
Then run the command to format the namenode:
c:\hadoop\bin>hdfs namenode -format
Now it should work!
I think the recommended way of doing this without deleting the data directory is to simply change the clusterID variable in the datanode's VERSION file.
If you look in your daemons directory, you will see the datanode directory, for example:
The VERSION file should look like this:
cat current/VERSION
#Tue Oct 14 17:31:58 CDT 2014
You need to change the clusterID to the first value in the error message, so in your case that would be CID-1745a89c-fb08-40f0-a14d-d37d01f199c3 (the namenode's clusterID) instead of CID-bb3547b0-03e4-4588-ac25-f0299ff81e4f (the datanode's).
The updated VERSION file should look like this, with the altered clusterID:
cat current/VERSION
#Tue Oct 14 17:31:58 CDT 2014
Restart hadoop and the datanode should start just fine.
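The edit described above can be scripted. The following is a sketch only: it assumes the VERSION file is a plain key=value file containing a clusterID= line, the function name is hypothetical, and the datanode should be stopped before the file is touched. It keeps a .bak copy next to the original.

```python
import re
import shutil


def set_cluster_id(version_path: str, new_cluster_id: str) -> str:
    """Replace the clusterID value in a VERSION file and return the old value."""
    with open(version_path, encoding="utf-8") as fh:
        text = fh.read()
    match = re.search(r"^clusterID=(.*)$", text, flags=re.MULTILINE)
    if match is None:
        raise ValueError(f"no clusterID entry found in {version_path}")
    # Keep a backup so the change can be reverted.
    shutil.copy2(version_path, version_path + ".bak")
    updated = text[:match.start(1)] + new_cluster_id + text[match.end(1):]
    with open(version_path, "w", encoding="utf-8") as fh:
        fh.write(updated)
    return match.group(1)
```

For the case in the question, the second argument would be the namenode's ID, CID-1745a89c-fb08-40f0-a14d-d37d01f199c3, and the first the path to current/VERSION inside the datanode's data directory.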
