Running Spark on the slave node (YARN) doesn't work - hadoop

I can run the SparkPi example on the master node, but when I try the same command
"spark-submit --class SparkPi --master yarn-client sparkpi.jar 10"
on the slave node, I get an error:
2015-05-19 14:05:44,881 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing view acls to: maintainer
2015-05-19 14:05:44,886 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing modify acls to: maintainer
2015-05-19 14:05:44,887 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(maintainer); users with modify permissions: Set(maintainer)
2015-05-19 14:05:45,389 INFO [sparkDriver-akka.actor.default-dispatcher-4] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2015-05-19 14:05:45,443 INFO [sparkDriver-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2015-05-19 14:05:45,641 INFO [sparkDriver-akka.actor.default-dispatcher-3] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriver@slave2.com:33055]
2015-05-19 14:05:45,644 INFO [sparkDriver-akka.actor.default-dispatcher-3] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting now listens on addresses: [akka.tcp://sparkDriver@slave2.com:33055]
2015-05-19 14:05:45,653 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 33055.
2015-05-19 14:05:45,674 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering MapOutputTracker
2015-05-19 14:05:45,688 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering BlockManagerMaster
2015-05-19 14:05:45,707 INFO [main] storage.DiskBlockManager (Logging.scala:logInfo(59)) - Created local directory at /tmp/spark-local-20150519140545-c81b
2015-05-19 14:05:45,712 INFO [main] storage.MemoryStore (Logging.scala:logInfo(59)) - MemoryStore started with capacity 265.4 MB
2015-05-19 14:05:46,205 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-05-19 14:05:46,408 INFO [main] spark.HttpFileServer (Logging.scala:logInfo(59)) - HTTP File server directory is /tmp/spark-e95a2b5b-efea-41eb-93b9-0a9f7d6f6701
2015-05-19 14:05:46,413 INFO [main] spark.HttpServer (Logging.scala:logInfo(59)) - Starting HTTP Server
2015-05-19 14:05:46,477 INFO [main] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-05-19 14:05:46,499 INFO [main] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SocketConnector@0.0.0.0:52737
2015-05-19 14:05:46,500 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'HTTP file server' on port 52737.
2015-05-19 14:05:46,790 INFO [main] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-05-19 14:05:46,805 INFO [main] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector@0.0.0.0:4040
2015-05-19 14:05:46,805 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'SparkUI' on port 4040.
2015-05-19 14:05:46,808 INFO [main] ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at http://slave2.com:4040
2015-05-19 14:05:47,058 INFO [main] spark.SparkContext (Logging.scala:logInfo(59)) - Added JAR file:/home/maintainer/myjars/sparkpi.jar at http://[ip]:52737/jars/sparkpi.jar with timestamp 1432033547057
2015-05-19 14:05:47,190 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
2015-05-19 14:09:45,861 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
2015-05-19 14:09:47,067 INFO [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-05-19 14:09:48,068 INFO [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
...

Aside from setting the yarn.resourcemanager.hostname property in yarn-site.xml, you also need to propagate the configuration files to the workers.
This can be done with the following line (run it before spark-submit):
export SPARK_YARN_DIST_FILES=$(ls "$HADOOP_CONF_DIR"/* | sed 's#^#file://#g' | tr '\n' ',' | sed 's/,$//')
If everything is configured correctly, you'll see the ResourceManager hostname instead of 0.0.0.0 in this line:
2015-05-19 14:05:47,190 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032

Exporting the correct value for HADOOP_CONF_DIR fixed the issue:
export HADOOP_CONF_DIR=/your-path/hadoop/conf
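
A quick way to see which ResourceManager host the client would pick up is to read yarn-site.xml from HADOOP_CONF_DIR directly. The following is only a minimal Python sketch: the helper name resourcemanager_hostname is made up for illustration (it is not part of Hadoop or Spark), and it assumes the usual <configuration>/<property>/<name>/<value> layout of yarn-site.xml. When the property is absent, YARN falls back to its default of 0.0.0.0, which matches the address in the log above.

```python
import os
import xml.etree.ElementTree as ET


def resourcemanager_hostname(conf_dir):
    """Return yarn.resourcemanager.hostname from conf_dir/yarn-site.xml, or None.

    Hypothetical helper for illustration; not a Hadoop or Spark API.
    """
    path = os.path.join(conf_dir, "yarn-site.xml")
    if not os.path.isfile(path):
        return None  # no yarn-site.xml in this directory
    root = ET.parse(path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "yarn.resourcemanager.hostname":
            # An empty <value> is treated the same as a missing property.
            return prop.findtext("value", "").strip() or None
    return None


if __name__ == "__main__":
    conf_dir = os.environ.get("HADOOP_CONF_DIR")
    if not conf_dir:
        print("HADOOP_CONF_DIR is not set")
    else:
        host = resourcemanager_hostname(conf_dir)
        print(host or "yarn.resourcemanager.hostname is not set (default is 0.0.0.0)")
```

Running this on the slave node before spark-submit should print the master's hostname; if it reports that HADOOP_CONF_DIR or the property is not set, the client will most likely try 0.0.0.0:8032 as in the question.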

Related

Trying to run statements in Pig, getting an error

When I read a file on HDFS using Pig in mapreduce mode and run dump b, the MapReduce job starts and completes, but then Pig keeps retrying a connection over and over. What is the problem? (I have set the permissions of the file and of /tmp in HDFS to 777.)
[root@master conf]# pig -x mapreduce
17/04/19 23:05:59 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
17/04/19 23:05:59 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
17/04/19 23:05:59 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2017-04-19 23:05:59,615 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0 (r1746530) compiled Jun 01 2016, 23:10:49
2017-04-19 23:05:59,615 [main] INFO org.apache.pig.Main - Logging error messages to: /opt/hadoop/pig/conf/pig_1492623359614.log
2017-04-19 23:05:59,652 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2017-04-19 23:06:01,031 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost/
2017-04-19 23:06:02,136 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:8021
2017-04-19 23:06:02,205 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-3df7c96f-9eac-4874-aab9-9ca7726fe860
2017-04-19 23:06:02,205 [main] WARN org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
grunt> a= load '/temp' AS (name:chararray, age:int, salary:int);
grunt> b= foreach a generate (name, salary);
grunt> dump b;
2017-04-19 23:06:22,093 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2017-04-19 23:06:22,190 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2017-04-19 23:06:22,267 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2017-04-19 23:06:22,309 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for a: $1
2017-04-19 23:06:22,456 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (Tenured Gen) of size 699072512 to monitor. collectionUsageThreshold = 489350752, usageThreshold = 489350752
2017-04-19 23:06:22,564 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2017-04-19 23:06:22,589 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2017-04-19 23:06:22,589 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2017-04-19 23:06:22,724 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2017-04-19 23:06:23,128 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2017-04-19 23:06:23,152 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2017-04-19 23:06:23,154 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2017-04-19 23:06:23,820 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/pig-0.16.0-core-h2.jar to DistributedCache through /tmp/temp2091099620/tmp-1166978625/pig-0.16.0-core-h2.jar
2017-04-19 23:06:23,951 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp2091099620/tmp-1829507825/automaton-1.11-8.jar
2017-04-19 23:06:24,026 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp2091099620/tmp-1436552250/antlr-runtime-3.4.jar
2017-04-19 23:06:24,119 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/opt/hadoop/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp2091099620/tmp-1393102603/joda-time-2.9.3.jar
2017-04-19 23:06:24,132 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2017-04-19 23:06:24,148 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2017-04-19 23:06:24,148 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2017-04-19 23:06:24,148 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2017-04-19 23:06:24,279 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2017-04-19 23:06:24,302 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2017-04-19 23:06:24,920 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2017-04-19 23:06:24,952 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2017-04-19 23:06:24,995 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2017-04-19 23:06:24,995 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2017-04-19 23:06:25,056 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2017-04-19 23:06:25,375 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2017-04-19 23:06:25,889 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1492621692528_0002
2017-04-19 23:06:26,195 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2017-04-19 23:06:26,411 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1492621692528_0002
2017-04-19 23:06:26,537 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master:8088/proxy/application_1492621692528_0002/
2017-04-19 23:06:26,537 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1492621692528_0002
2017-04-19 23:06:26,537 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a,b
2017-04-19 23:06:26,537 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[1,3],b[-1,-1] C: R:
2017-04-19 23:06:26,595 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2017-04-19 23:06:26,595 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1492621692528_0002]
2017-04-19 23:06:48,598 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2017-04-19 23:06:48,598 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1492621692528_0002]
2017-04-19 23:06:51,639 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2017-04-19 23:06:51,705 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-04-19 23:06:52,983 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:53,985 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:54,989 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:55,993 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:56,994 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:57,995 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:06:58,999 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:07:00,001 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-04-19 23:07:01,005 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
[2]+ Stopped pig -x mapreduce
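Start the JobHistoryServer:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
Pig, when run in mapreduce mode, expects the JobHistoryServer to be available. The repeated "Retrying connect to server: 0.0.0.0/0.0.0.0:10020" lines in the log are Pig trying to reach it after the job has finished.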
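
To confirm that the JobHistoryServer is actually listening after you start it, a plain TCP check against its port is enough. This is a minimal Python sketch, not anything shipped with Pig or Hadoop: port_is_open is a hypothetical helper, and "localhost" is an assumption (use the host from mapreduce.jobhistory.address if you have set it; port 10020 is the one shown in the log).

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # "localhost" is an assumed host; 10020 is the port from the retry messages.
    print("JobHistoryServer reachable:", port_is_open("localhost", 10020))
```

If this prints False after starting the daemon, checking the history server's own log is probably the next step.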

Spark app can run in standalone mode but can't run in yarn cluster

Hello everyone, this problem has troubled me for a long time. I can run my Spark app in standalone mode with this command:
spark-submit --master spark://fuxiuyin-virtual-machine:7077 test_app.py
But the app fails to run on the YARN cluster with this command:
spark-submit --master yarn test_app.py
I think my yarn cluster is healthy.
The output of jps is
$ jps
8289 Worker
14882 NameNode
15475 ResourceManager
8134 Master
15751 NodeManager
15063 DataNode
17212 Jps
15295 SecondaryNameNode
And the 'Nodes of the cluster' page is
here
The output of spark-submit is
$ /opt/spark/bin/spark-submit --master yarn test_app.py
16/10/28 16:54:39 INFO spark.SparkContext: Running Spark version 2.0.1
16/10/28 16:54:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/10/28 16:54:39 INFO spark.SecurityManager: Changing view acls to: fuxiuyin
16/10/28 16:54:39 INFO spark.SecurityManager: Changing modify acls to: fuxiuyin
16/10/28 16:54:39 INFO spark.SecurityManager: Changing view acls groups to:
16/10/28 16:54:39 INFO spark.SecurityManager: Changing modify acls groups to:
16/10/28 16:54:39 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(fuxiuyin); groups with view permissions: Set(); users with modify permissions: Set(fuxiuyin); groups with modify permissions: Set()
16/10/28 16:54:39 INFO util.Utils: Successfully started service 'sparkDriver' on port 42519.
16/10/28 16:54:39 INFO spark.SparkEnv: Registering MapOutputTracker
16/10/28 16:54:39 INFO spark.SparkEnv: Registering BlockManagerMaster
16/10/28 16:54:39 INFO storage.DiskBlockManager: Created local directory at /opt/spark/blockmgr-1dcd1d1a-4cf4-4778-9b71-53e238a62c97
16/10/28 16:54:39 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
16/10/28 16:54:40 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/10/28 16:54:40 INFO util.log: Logging initialized @1843ms
16/10/28 16:54:40 INFO server.Server: jetty-9.2.z-SNAPSHOT
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b933891{/jobs,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@580d9060{/jobs/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3a8fb3d9{/jobs/job,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@744ecb1b{/jobs/job/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@761b32b3{/stages,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@42213280{/stages/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5775066{/stages/stage,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7e355c0{/stages/stage/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@28426125{/stages/pool,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@63bcf39f{/stages/pool/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5cf77bee{/storage,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@412768e5{/storage/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7ad772ad{/storage/rdd,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7ef35663{/storage/rdd/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@193c7a58{/environment,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@63a649da{/environment/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@22251d19{/executors,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@46810770{/executors/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3c155b42{/executors/threadDump,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6dac2d83{/executors/threadDump/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@67eb38fa{/static,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@291f19f0{/,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3f4688da{/api,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@338a7a84{/stages/stage/kill,null,AVAILABLE}
16/10/28 16:54:40 INFO server.ServerConnector: Started ServerConnector@7df0e73{HTTP/1.1}{fuxiuyin-virtual-machine:4040}
16/10/28 16:54:40 INFO server.Server: Started @1962ms
16/10/28 16:54:40 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/10/28 16:54:40 INFO ui.SparkUI: Bound SparkUI to fuxiuyin-virtual-machine, and started at http://192.168.102.133:4040
16/10/28 16:54:40 INFO client.RMProxy: Connecting to ResourceManager at fuxiuyin-virtual-machine/192.168.102.133:8032
16/10/28 16:54:41 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/10/28 16:54:41 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/10/28 16:54:41 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/10/28 16:54:41 INFO yarn.Client: Setting up container launch context for our AM
16/10/28 16:54:41 INFO yarn.Client: Setting up the launch environment for our AM container
16/10/28 16:54:41 INFO yarn.Client: Preparing resources for our AM container
16/10/28 16:54:41 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/10/28 16:54:42 INFO yarn.Client: Uploading resource file:/opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/__spark_libs__697818607740390689.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/__spark_libs__697818607740390689.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/python/lib/pyspark.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/pyspark.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/python/lib/py4j-0.10.3-src.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/py4j-0.10.3-src.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/__spark_conf__7760765070208746118.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/__spark_conf__.zip
16/10/28 16:54:45 INFO spark.SecurityManager: Changing view acls to: fuxiuyin
16/10/28 16:54:45 INFO spark.SecurityManager: Changing modify acls to: fuxiuyin
16/10/28 16:54:45 INFO spark.SecurityManager: Changing view acls groups to:
16/10/28 16:54:45 INFO spark.SecurityManager: Changing modify acls groups to:
16/10/28 16:54:45 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(fuxiuyin); groups with view permissions: Set(); users with modify permissions: Set(fuxiuyin); groups with modify permissions: Set()
16/10/28 16:54:45 INFO yarn.Client: Submitting application application_1477644823180_0001 to ResourceManager
16/10/28 16:54:45 INFO impl.YarnClientImpl: Submitted application application_1477644823180_0001
16/10/28 16:54:45 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1477644823180_0001 and attemptId None
16/10/28 16:54:46 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:46 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1477644885891
final status: UNDEFINED
tracking URL: http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001/
user: fuxiuyin
16/10/28 16:54:47 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:48 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:49 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:50 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:51 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:52 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/10/28 16:54:52 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> fuxiuyin-virtual-machine, PROXY_URI_BASES -> http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001), /proxy/application_1477644823180_0001
16/10/28 16:54:52 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/10/28 16:54:52 INFO yarn.Client: Application report for application_1477644823180_0001 (state: RUNNING)
16/10/28 16:54:52 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.102.133
ApplicationMaster RPC port: 0
queue: default
start time: 1477644885891
final status: UNDEFINED
tracking URL: http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001/
user: fuxiuyin
16/10/28 16:54:52 INFO cluster.YarnClientSchedulerBackend: Application application_1477644823180_0001 has started running.
16/10/28 16:54:52 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39951.
16/10/28 16:54:52 INFO netty.NettyBlockTransferService: Server created on 192.168.102.133:39951
16/10/28 16:54:53 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.102.133:39951 with 366.3 MB RAM, BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@43ba5458{/metrics/json,null,AVAILABLE}
16/10/28 16:54:57 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/10/28 16:54:57 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> fuxiuyin-virtual-machine, PROXY_URI_BASES -> http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001), /proxy/application_1477644823180_0001
16/10/28 16:54:57 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/10/28 16:54:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.102.133:45708) with ID 1
16/10/28 16:54:59 INFO storage.BlockManagerMasterEndpoint: Registering block manager fuxiuyin-virtual-machine:33074 with 366.3 MB RAM, BlockManagerId(1, fuxiuyin-virtual-machine, 33074)
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.102.133:45712) with ID 2
16/10/28 16:55:00 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/10/28 16:55:00 INFO storage.BlockManagerMasterEndpoint: Registering block manager fuxiuyin-virtual-machine:43740 with 366.3 MB RAM, BlockManagerId(2, fuxiuyin-virtual-machine, 43740)
16/10/28 16:55:00 INFO spark.SparkContext: Starting job: collect at /home/fuxiuyin/test_app.py:8
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Got job 0 (collect at /home/fuxiuyin/test_app.py:8) with 2 output partitions
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (collect at /home/fuxiuyin/test_app.py:8)
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Missing parents: List()
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (PythonRDD[1] at collect at /home/fuxiuyin/test_app.py:8), which has no missing parents
16/10/28 16:55:00 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.8 KB, free 366.3 MB)
16/10/28 16:55:00 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.5 KB, free 366.3 MB)
16/10/28 16:55:00 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.102.133:39951 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:00 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (PythonRDD[1] at collect at /home/fuxiuyin/test_app.py:8)
16/10/28 16:55:00 INFO cluster.YarnScheduler: Adding task set 0.0 with 2 tasks
16/10/28 16:55:00 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, fuxiuyin-virtual-machine, partition 0, PROCESS_LOCAL, 5450 bytes)
16/10/28 16:55:00 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, fuxiuyin-virtual-machine, partition 1, PROCESS_LOCAL, 5469 bytes)
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 0 on executor id: 2 hostname: fuxiuyin-virtual-machine.
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 1 on executor id: 1 hostname: fuxiuyin-virtual-machine.
16/10/28 16:55:01 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/10/28 16:55:01 INFO server.ServerConnector: Stopped ServerConnector@7df0e73{HTTP/1.1}{fuxiuyin-virtual-machine:4040}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@338a7a84{/stages/stage/kill,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3f4688da{/api,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@291f19f0{/,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@67eb38fa{/static,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6dac2d83{/executors/threadDump/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3c155b42{/executors/threadDump,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@46810770{/executors/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@22251d19{/executors,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@63a649da{/environment/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@193c7a58{/environment,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7ef35663{/storage/rdd/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7ad772ad{/storage/rdd,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@412768e5{/storage/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5cf77bee{/storage,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@63bcf39f{/stages/pool/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@28426125{/stages/pool,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7e355c0{/stages/stage/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5775066{/stages/stage,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@42213280{/stages/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@761b32b3{/stages,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@744ecb1b{/jobs/job/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3a8fb3d9{/jobs/job,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@580d9060{/jobs/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1b933891{/jobs,null,UNAVAILABLE}
16/10/28 16:55:01 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.102.133:4040
16/10/28 16:55:01 INFO scheduler.DAGScheduler: Job 0 failed: collect at /home/fuxiuyin/test_app.py:8, took 0.383872 s
16/10/28 16:55:01 INFO scheduler.DAGScheduler: ResultStage 0 (collect at /home/fuxiuyin/test_app.py:8) failed in 0.233 s
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo#469337f1)
Traceback (most recent call last):
File "/home/fuxiuyin/test_app.py", line 8, in <module>
print(data.collect())
File "/opt/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 776, in collect
File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 319, in get_return_value
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(0,1477644901073,JobFailed(org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down))
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:818)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:816)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:816)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1685)
at org.apache.spark.util.EventLoop.stop(EventLoop.scala:83)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1604)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1798)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1287)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1797)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:108)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1890)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1903)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1916)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1930)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:912)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
at org.apache.spark.rdd.RDD.collect(RDD.scala:911)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:453)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
16/10/28 16:55:01 ERROR client.TransportClient: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
16/10/28 16:55:01 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
16/10/28 16:55:01 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
16/10/28 16:55:01 ERROR util.Utils: Uncaught exception in thread Yarn application state monitor
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:508)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:93)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:151)
at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:455)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1605)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1798)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1287)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1797)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:108)
Caused by: java.io.IOException: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
16/10/28 16:55:01 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/10/28 16:55:01 INFO storage.DiskBlockManager: Shutdown hook called
16/10/28 16:55:01 INFO util.ShutdownHookManager: Shutdown hook called
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/userFiles-f51df2cd-8ec0-4caa-862f-77db0cc72505
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/pyspark-5216f977-d3c3-495f-b91a-88fa2218696d
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898
16/10/28 16:55:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on fuxiuyin-virtual-machine:43740 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(2, fuxiuyin-virtual-machine, 43740),broadcast_0_piece0,StorageLevel(memory, 1 replicas),2517,0))
16/10/28 16:55:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on fuxiuyin-virtual-machine:33074 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(1, fuxiuyin-virtual-machine, 33074),broadcast_0_piece0,StorageLevel(memory, 1 replicas),2517,0))
16/10/28 16:55:01 INFO memory.MemoryStore: MemoryStore cleared
16/10/28 16:55:01 INFO storage.BlockManager: BlockManager stopped
The YARN ResourceManager log is in
yarn-fuxiuyin-resourcemanager-fuxiuyin-virtual-machine.log
I submitted the app as this user:
uid=1000(fuxiuyin) gid=1000(fuxiuyin) groups=1000(fuxiuyin),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),108(lpadmin),124(sambashare)
My test_app is:
from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName("test_app")
sc = SparkContext(conf=conf)
data = sc.parallelize([1, 2, 3])
data = data.map(lambda x: x + 1)
print(data.collect())
I don't know how to fix it.
Thanks.
The driver has to collect all the data from the worker nodes before printing, so use the code below.
I think the error is due to
print(data.collect())
Use this instead:
for x in data.collect():
    print(x)
and submit the app with:
spark-submit --master yarn --deploy-mode cluster test_app.py
instead of
spark-submit --master yarn test_app.py
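To make the difference between the two submit commands concrete, here is a small sketch that builds the spark-submit argument list for either deploy mode. The helper name build_submit_command is hypothetical (it is not part of Spark); only the --master and --deploy-mode options are real spark-submit flags.

```python
import shlex


def build_submit_command(app, master="yarn", deploy_mode="client"):
    """Return the spark-submit argument list for the given deploy mode.

    build_submit_command is an illustrative helper, not a Spark API.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return ["spark-submit", "--master", master,
            "--deploy-mode", deploy_mode, app]


if __name__ == "__main__":
    # Print both variants so they can be compared or pasted into a shell.
    for mode in ("client", "cluster"):
        argv = build_submit_command("test_app.py", deploy_mode=mode)
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Keep in mind that in cluster mode the driver runs inside a YARN container, so whatever test_app.py prints ends up in the container logs and not in the terminal where spark-submit was started.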
Try this command:
spark-submit --master yarn-client test_app.py
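Whichever command is used, the driver output only says that the YARN application ended ("Yarn application has already exited with state FINISHED!"). The reason is usually found in the ApplicationMaster container log, not in the driver log. As a rough sketch, the snippet below pulls the application id and the exit state out of driver output and prints the matching yarn logs command. The function name summarize_driver_log is made up for illustration, and the two sample lines are in the format of the driver output quoted in this thread. It also assumes log aggregation is enabled on the cluster; otherwise the container logs stay in the NodeManager's local log directory.

```python
import re

# Patterns for the two pieces of information we want from the driver output.
APP_ID = re.compile(r"application_\d+_\d+")
EXIT_STATE = re.compile(r"Yarn application has already exited with state (\w+)")


def summarize_driver_log(text):
    """Return (application id, exit state); either may be None if not found.

    summarize_driver_log is an illustrative helper, not a Spark or YARN API.
    """
    app = APP_ID.search(text)
    state = EXIT_STATE.search(text)
    return (app.group(0) if app else None,
            state.group(1) if state else None)


# Sample lines in the format of the driver output quoted in this thread.
sample = (
    "16/03/19 05:59:25 INFO impl.YarnClientImpl: "
    "Submitted application application_1458334003417_0002\n"
    "16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: "
    "Yarn application has already exited with state FINISHED!\n"
)

app_id, state = summarize_driver_log(sample)
if app_id:
    # Command to fetch the aggregated container logs for this application.
    print("yarn logs -applicationId " + app_id)
```

The output of that yarn logs command is the place to look for the actual exception that made the ApplicationMaster exit.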

Could you give me any clue why I get 'Cannot call methods on a stopped SparkContext'?

When I run 'val lines = sc.textFile("hdfs:///input")' in yarn-client mode, a 'Cannot call methods on a stopped SparkContext' error occurs. I have been searching for two days, but I can't find the cause. The path "hdfs:///input" is correct, because the same line worked well when I ran it in standalone mode.
Could you give me any idea about this?
I'm using Spark 1.5.2 and Hadoop 2.7.2.
starting org.apache.spark.deploy.master.Master, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
192.168.111.203: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out
192.168.111.202: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
[root@master spark-1.5.2-bin-hadoop2.6]# bin/spark-shell --master yarn-client
16/03/19 05:59:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 05:59:12 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:12 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:13 INFO spark.HttpServer: Starting HTTP Server
16/03/19 05:59:13 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:13 INFO server.AbstractConnector: Started SocketConnector#0.0.0.0:46780
16/03/19 05:59:13 INFO util.Utils: Successfully started service 'HTTP class server' on port 46780.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.5.2
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_73)
Type in expressions to have them evaluated.
Type :help for more information.
16/03/19 05:59:17 INFO spark.SparkContext: Running Spark version 1.5.2
16/03/19 05:59:17 WARN spark.SparkConf:
SPARK_JAVA_OPTS was detected (set to '-Dspark.driver.port=53411').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with conf/spark-defaults.conf to set defaults for an application
- ./spark-submit with --driver-java-options to set -X options for a driver
- spark.executor.extraJavaOptions to set -X options for executors
- SPARK_DAEMON_JAVA_OPTS to set java options for standalone daemons (master or worker)
16/03/19 05:59:17 WARN spark.SparkConf: Setting 'spark.executor.extraJavaOptions' to '-Dspark.driver.port=53411' as a work-around.
16/03/19 05:59:17 WARN spark.SparkConf: Setting 'spark.driver.extraJavaOptions' to '-Dspark.driver.port=53411' as a work-around.
16/03/19 05:59:17 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:17 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:17 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:18 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/03/19 05:59:18 INFO Remoting: Starting remoting
16/03/19 05:59:18 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver#192.168.111.201:53411]
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'sparkDriver' on port 53411.
16/03/19 05:59:18 INFO spark.SparkEnv: Registering MapOutputTracker
16/03/19 05:59:18 INFO spark.SparkEnv: Registering BlockManagerMaster
16/03/19 05:59:18 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-f70b1bb6-288b-4894-bb49-22d1fc3d8d89
16/03/19 05:59:18 INFO storage.MemoryStore: MemoryStore started with capacity 534.5 MB
16/03/19 05:59:18 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-58591b6b-5b19-4bc0-a993-0b846de5ef6f/httpd-fe0c46a2-1d87-4bc7-8b4f-adfc79cb762a
16/03/19 05:59:18 INFO spark.HttpServer: Starting HTTP Server
16/03/19 05:59:18 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:18 INFO server.AbstractConnector: Started SocketConnector#0.0.0.0:40258
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'HTTP file server' on port 40258.
16/03/19 05:59:18 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/03/19 05:59:18 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:18 INFO server.AbstractConnector: Started SelectChannelConnector#0.0.0.0:4040
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/03/19 05:59:18 INFO ui.SparkUI: Started SparkUI at http://192.168.111.201:4040
16/03/19 05:59:19 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/03/19 05:59:19 INFO client.RMProxy: Connecting to ResourceManager at /192.168.111.201:8032
16/03/19 05:59:19 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
16/03/19 05:59:19 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/03/19 05:59:19 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/03/19 05:59:19 INFO yarn.Client: Setting up container launch context for our AM
16/03/19 05:59:19 INFO yarn.Client: Setting up the launch environment for our AM container
16/03/19 05:59:19 INFO yarn.Client: Preparing resources for our AM container
16/03/19 05:59:21 INFO yarn.Client: Uploading resource file:/opt/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar -> hdfs://192.168.111.201:9000/user/root/.sparkStaging/application_1458334003417_0002/spark-assembly-1.5.2-hadoop2.6.0.jar
16/03/19 05:59:25 INFO yarn.Client: Uploading resource file:/tmp/spark-58591b6b-5b19-4bc0-a993-0b846de5ef6f/__spark_conf__2052137095112870542.zip -> hdfs://192.168.111.201:9000/user/root/.sparkStaging/application_1458334003417_0002/__spark_conf__2052137095112870542.zip
16/03/19 05:59:25 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:25 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:25 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:25 INFO yarn.Client: Submitting application 2 to ResourceManager
16/03/19 05:59:25 INFO impl.YarnClientImpl: Submitted application application_1458334003417_0002
16/03/19 05:59:26 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:26 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1458334765746
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1458334003417_0002/
user: root
16/03/19 05:59:27 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:28 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:29 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:30 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:31 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:32 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:33 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:34 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:35 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM#192.168.111.203:46505/user/YarnAM#149895142])
16/03/19 05:59:35 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1458334003417_0002), /proxy/application_1458334003417_0002
16/03/19 05:59:35 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/19 05:59:35 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:46505
16/03/19 05:59:35 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM#192.168.111.203:46505] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:35 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:46505
16/03/19 05:59:35 INFO yarn.Client: Application report for application_1458334003417_0002 (state: RUNNING)
16/03/19 05:59:35 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.111.203
ApplicationMaster RPC port: 0
queue: default
start time: 1458334765746
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1458334003417_0002/
user: root
16/03/19 05:59:35 INFO cluster.YarnClientSchedulerBackend: Application application_1458334003417_0002 has started running.
16/03/19 05:59:36 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42938.
16/03/19 05:59:36 INFO netty.NettyBlockTransferService: Server created on 42938
16/03/19 05:59:36 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/03/19 05:59:36 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.111.201:42938 with 534.5 MB RAM, BlockManagerId(driver, 192.168.111.201, 42938)
16/03/19 05:59:36 INFO storage.BlockManagerMaster: Registered BlockManager
16/03/19 05:59:40 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM#192.168.111.203:34633/user/YarnAM#-40449267])
16/03/19 05:59:40 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1458334003417_0002), /proxy/application_1458334003417_0002
16/03/19 05:59:40 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM#192.168.111.203:34633] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/03/19 05:59:41 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.111.201:4040
16/03/19 05:59:41 INFO scheduler.DAGScheduler: Stopping DAGScheduler
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Stopped
16/03/19 05:59:42 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/19 05:59:42 INFO storage.MemoryStore: MemoryStore cleared
16/03/19 05:59:42 INFO storage.BlockManager: BlockManager stopped
16/03/19 05:59:42 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/03/19 05:59:42 INFO spark.SparkContext: Successfully stopped SparkContext
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/03/19 05:59:49 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/03/19 05:59:49 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.
16/03/19 05:59:49 INFO hive.HiveContext: Initializing execution hive, version 1.2.1
16/03/19 05:59:49 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0
16/03/19 05:59:49 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/03/19 05:59:50 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/03/19 05:59:50 INFO metastore.ObjectStore: ObjectStore, initialize called
16/03/19 05:59:50 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/03/19 05:59:50 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/03/19 05:59:50 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 05:59:51 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 05:59:53 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/03/19 05:59:54 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:54 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/03/19 05:59:56 INFO metastore.ObjectStore: Initialized ObjectStore
16/03/19 05:59:57 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/03/19 05:59:57 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/03/19 05:59:57 INFO metastore.HiveMetaStore: Added admin role in metastore
16/03/19 05:59:57 INFO metastore.HiveMetaStore: Added public role in metastore
16/03/19 05:59:58 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/03/19 05:59:58 INFO metastore.HiveMetaStore: 0: get_all_databases
16/03/19 05:59:58 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/03/19 05:59:58 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/19 05:59:58 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/19 05:59:58 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/root
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/e16dc45f-de41-4e69-9f73-c976cc3358c9_resources
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/e16dc45f-de41-4e69-9f73-c976cc3358c9
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/root/e16dc45f-de41-4e69-9f73-c976cc3358c9
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/e16dc45f-de41-4e69-9f73-c976cc3358c9/_tmp_space.db
16/03/19 05:59:58 INFO hive.HiveContext: default warehouse location is /user/hive/warehouse
16/03/19 05:59:58 INFO hive.HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/03/19 05:59:58 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0
16/03/19 05:59:59 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/03/19 06:00:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 06:00:00 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/03/19 06:00:00 INFO metastore.ObjectStore: ObjectStore, initialize called
16/03/19 06:00:00 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/03/19 06:00:00 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/03/19 06:00:00 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 06:00:00 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 06:00:01 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/03/19 06:00:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/03/19 06:00:04 INFO metastore.ObjectStore: Initialized ObjectStore
16/03/19 06:00:04 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/03/19 06:00:05 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/03/19 06:00:05 INFO metastore.HiveMetaStore: Added admin role in metastore
16/03/19 06:00:05 INFO metastore.HiveMetaStore: Added public role in metastore
16/03/19 06:00:05 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/03/19 06:00:05 INFO metastore.HiveMetaStore: 0: get_all_databases
16/03/19 06:00:05 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/03/19 06:00:06 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/19 06:00:06 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/19 06:00:06 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:06 INFO session.SessionState: Created local directory: /tmp/b046e212-ccbd-4415-aec3-5b207f147fda_resources
16/03/19 06:00:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/b046e212-ccbd-4415-aec3-5b207f147fda
16/03/19 06:00:06 INFO session.SessionState: Created local directory: /tmp/root/b046e212-ccbd-4415-aec3-5b207f147fda
16/03/19 06:00:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/b046e212-ccbd-4415-aec3-5b207f147fda/_tmp_space.db
16/03/19 06:00:06 INFO repl.SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.
scala> val lines = sc.textFile("hdfs:///input")
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)
at org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:2063)
at org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:2076)
at org.apache.spark.SparkContext.textFile$default$2(SparkContext.scala:825)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:21)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:28)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
at $iwC$$iwC$$iwC.<init>(<console>:34)
at $iwC$$iwC.<init>(<console>:36)
at $iwC.<init>(<console>:38)
at <init>(<console>:40)
at .<init>(<console>:44)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I encountered this in my Spark Structured Streaming application when I forgot to include the following:
spark.streams.awaitAnyTermination()
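For context, here is a minimal PySpark sketch of where that call sits in a Structured Streaming job. It assumes Spark 2.2 or later (for the built-in `rate` test source); the function name `run_stream` and the application name are illustrative and not taken from the original application.

```python
def run_stream():
    # Imported inside the function so this file can be loaded on machines
    # without PySpark; the job itself of course needs a Spark installation.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("AwaitTerminationSketch").getOrCreate()

    # "rate" is a built-in test source that generates rows at a fixed rate.
    rows = spark.readStream.format("rate").option("rowsPerSecond", 1).load()

    # start() returns immediately; the query runs on background threads.
    rows.writeStream.format("console").outputMode("append").start()

    # Block the driver until any active query stops or fails. Without this
    # line the main program reaches its end, the driver exits, and the
    # SparkContext is stopped while the query is still expected to run.
    spark.streams.awaitAnyTermination()
```

Calling `run_stream()` from a script passed to spark-submit should keep the driver alive until a query terminates.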
Your YARN application exits immediately after it starts:
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@192.168.111.203:34633] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
The SparkContext is then stopped, so any call on it throws the exception you see.
Check the "Application Master" logs (visible through YARN's UI) to find the cause of the failure. It could be a memory configuration issue, a network issue (e.g. host unreachable), or something else; the driver-side log (which is what you pasted) won't tell you which one it is.
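Besides the YARN UI, the same logs can usually be fetched from the command line with `yarn logs -applicationId <id>`, provided log aggregation is enabled on the cluster. The sketch below is only an illustration: it pulls the application id out of driver output and builds that command. The helper names and the sample log line are assumptions, not part of the original post.

```python
import re
from typing import List, Optional

# YARN application ids look like application_<clusterTimestamp>_<sequence>.
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")


def find_application_id(driver_log: str) -> Optional[str]:
    """Return the first YARN application id found in the driver log, if any."""
    match = APP_ID_PATTERN.search(driver_log)
    return match.group(0) if match else None


def yarn_logs_command(app_id: str) -> List[str]:
    """Build the `yarn logs` invocation that prints the aggregated container logs."""
    return ["yarn", "logs", "-applicationId", app_id]


# Sample line in the format Spark's yarn.Client prints (illustrative input).
sample = ("INFO yarn.Client: Application report for "
          "application_1428915066363_0013 (state: ACCEPTED)")

app_id = find_application_id(sample)
if app_id is not None:
    print(" ".join(yarn_logs_command(app_id)))
```

Running the printed command on a cluster node shows the Application Master's stdout, stderr and syslog, which is where the real reason for the exit normally appears.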

MapReduce job is failing with the error "Failed to write data"

I'm trying to export data from Teradata to Hadoop, but my export query fails with the error "Failed to write data". Please look at the MapReduce and application logs below:
Log Type: syslog
Log Upload Time: Tue Mar 08 22:59:27 -0800 2016
Log Length: 4931
2016-03-08 22:47:07,414 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2016-03-08 22:47:07,499 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-03-08 22:47:07,499 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2016-03-08 22:47:07,509 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2016-03-08 22:47:07,510 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1457504560070_0004, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@175b9425)
2016-03-08 22:47:07,556 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: RM_DELEGATION_TOKEN, Service: 39.7.48.2:8032,39.7.48.3:8032, Ident: (owner=hive, renewer=oozie mr token, realUser=oozie, issueDate=1457506410968, maxDate=1458111210968, sequenceNumber=908, masterKeyId=280)
2016-03-08 22:47:07,599 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2016-03-08 22:47:07,848 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /data1/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data2/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data3/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data4/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data5/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data6/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data7/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data8/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data9/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data10/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data12/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004
2016-03-08 22:47:08,132 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2016-03-08 22:47:08,623 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2016-03-08 22:47:08,840 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: com.teradata.dynaload.hcatalog.mapper.TDInputFormat$TeradataInputSplit@2ece4966
2016-03-08 22:47:08,844 INFO [main] com.teradata.dynaload.hcatalog.mapper.TDInputFormat$TeradataRecordReader: recordreader class com.teradata.dynaload.hcatalog.mapper.TDInputFormat$TeradataRecordReaderinitialize time is: 1457506028844
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 300417020(1201668080)
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 1146
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 841167680
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 1201668096
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 300417020; length = 75104256
2016-03-08 22:47:09,515 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2016-03-08 22:47:09,518 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
2016-03-08 22:47:09,518 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2016-03-08 22:47:09,848 WARN [main] org.apache.hadoop.hive.conf.HiveConf: HiveConf of name hive.metastore.local does not exist
2016-03-08 22:47:09,914 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://apus2.labs.teradata.com:9083
2016-03-08 22:47:09,951 INFO [main] hive.metastore: Connected to metastore.
2016-03-08 22:47:10,407 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2016-03-08 22:47:10,452 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
2016-03-08 22:47:10,453 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.work.output.dir is deprecated. Instead, use mapreduce.task.output.dir
2016-03-08 22:47:10,453 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
2016-03-08 22:47:10,457 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
Application Master logs:
Log Type: stderr
Log Upload Time: Tue Mar 08 22:59:27 -0800 2016
Log Length: 240
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Log Type: stdout
Log Upload Time: Tue Mar 08 22:59:27 -0800 2016
Log Length: 0
Log Type: syslog
Log Upload Time: Tue Mar 08 22:59:27 -0800 2016
Log Length: 66959
Showing 4096 bytes of 66959 total.
ILED
2016-03-08 22:59:19,325 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing event JOB_FAILED
2016-03-08 22:59:19,456 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://C423A:8020/user/hive/.staging/job_1457504560070_0004/job_1457504560070_0004_1.jhist to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004-1457506422934-hive-oozie%3Aaction%3AT%3Djava%3AW%3DTDExportMR%3AA%3Dexport%3AID%3D00001-1457506759193-0-0-FAILED-default-1457506429243.jhist_tmp
2016-03-08 22:59:19,550 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004-1457506422934-hive-oozie%3Aaction%3AT%3Djava%3AW%3DTDExportMR%3AA%3Dexport%3AID%3D00001-1457506759193-0-0-FAILED-default-1457506429243.jhist_tmp
2016-03-08 22:59:19,562 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://C423A:8020/user/hive/.staging/job_1457504560070_0004/job_1457504560070_0004_1_conf.xml to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004_conf.xml_tmp
2016-03-08 22:59:19,614 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004_conf.xml_tmp
2016-03-08 22:59:19,645 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004.summary_tmp to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004.summary
2016-03-08 22:59:19,654 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004_conf.xml_tmp to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004_conf.xml
2016-03-08 22:59:19,666 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004-1457506422934-hive-oozie%3Aaction%3AT%3Djava%3AW%3DTDExportMR%3AA%3Dexport%3AID%3D00001-1457506759193-0-0-FAILED-default-1457506429243.jhist_tmp to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004-1457506422934-hive-oozie%3Aaction%3AT%3Djava%3AW%3DTDExportMR%3AA%3Dexport%3AID%3D00001-1457506759193-0-0-FAILED-default-1457506429243.jhist
2016-03-08 22:59:19,666 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2016-03-08 22:59:19,671 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics to Task failed task_1457504560070_0004_m_000004
Job failed as tasks failed. failedMaps:1 failedReduces:0
2016-03-08 22:59:19,672 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: History url is http://apus2.labs.teradata.com:19888/jobhistory/job/job_1457504560070_0004
2016-03-08 22:59:19,680 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Waiting for application to be successfully unregistered.
2016-03-08 22:59:20,682 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Final Stats: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:7 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:7 ContRel:0 HostLocal:6 RackLocal:1
2016-03-08 22:59:20,684 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://C423A /user/hive/.staging/job_1457504560070_0004
2016-03-08 22:59:20,711 INFO [Thread-89] org.apache.hadoop.ipc.Server: Stopping server on 46067
2016-03-08 22:59:20,712 INFO [IPC Server listener on 46067] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 46067
2016-03-08 22:59:20,712 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-03-08 22:59:20,714 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted.
Please help me resolve the issue.
You must be using Sqoop to bring the data into Hadoop. Please paste the command you are running. "Failed to write data" can have several causes: the destination parent directory is not available, the cluster has run out of space, and so on. Only the command can tell which one applies.
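The two causes named above can be checked from a cluster node with standard HDFS shell commands (`hdfs dfs -test -d` and `hdfs dfs -df -h`). The following is a rough sketch that builds those commands; the target path is hypothetical, since the actual export destination is not shown in the question.

```python
import posixpath
import subprocess
from typing import List


def parent_dir_check(target_dir: str) -> List[str]:
    """Command that exits with status 0 if the parent of target_dir exists in HDFS."""
    parent = posixpath.dirname(target_dir.rstrip("/")) or "/"
    return ["hdfs", "dfs", "-test", "-d", parent]


def free_space_check() -> List[str]:
    """Command that prints capacity and free space of the file system."""
    return ["hdfs", "dfs", "-df", "-h", "/"]


def run(cmd: List[str]) -> int:
    """Run a command and return its exit status (needs the hdfs CLI on PATH)."""
    return subprocess.run(cmd).returncode


# Hypothetical destination directory, used only to show the resulting commands.
target = "/user/hive/warehouse/export_table"
print(" ".join(parent_dir_check(target)))
print(" ".join(free_space_check()))
```

Passing either command list to `run` on a machine with the Hadoop client installed executes the check; a non-zero status from the `-test -d` command means the parent directory is missing.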

Got exception "unread block data" when reading an HBase table into a Spark (1.2.0.2.2.0.0-82) RDD using PySpark in yarn-client mode on the HDP (2.2) platform

I get a strange exception when reading an HBase (0.98.4.2.2.0.0) table into a Spark (1.2.0.2.2.0.0-82) RDD using PySpark in yarn-client mode (2.6.0) on the HDP (2.2) platform:
2015-04-14 19:05:11,295 WARN [task-result-getter-0] scheduler.TaskSetManager (Logging.scala:logWarning(71)) - Lost task 0.0 in stage 0.0 (TID 0, hadoop-node05.mathartsys.com): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I followed the Spark example Python code (https://github.com/apache/spark/blob/master/examples/src/main/python/hbase_inputformat.py),
and my code is:
import sys
from pyspark import SparkContext
if __name__ == "__main__":
    sc = SparkContext(appName="HBaseInputFormat")
    # HBase and ZooKeeper connection settings passed to TableInputFormat
    conf = {"hbase.zookeeper.quorum": "hadoop-node01.mathartsys.com,hadoop-node02.mathartsys.com,hadoop-node03.mathartsys.com",
            "hbase.mapreduce.inputtable": "test",
            "hbase.cluster.distributed": "true",
            "hbase.rootdir": "hdfs://hadoop-node01.mathartsys.com:8020/apps/hbase/data",
            "hbase.zookeeper.property.clientPort": "2181",
            "zookeeper.session.timeout": "30000",
            "zookeeper.znode.parent": "/hbase-unsecure"}
    # Converters from the Spark examples jar that turn HBase keys and rows into strings
    keyConv = "org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter"
    valueConv = "org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter"
    hbase_rdd = sc.newAPIHadoopRDD(
        "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
        "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        "org.apache.hadoop.hbase.client.Result",
        keyConverter=keyConv,
        valueConverter=valueConv,
        conf=conf)
    # Bring every (row key, row) pair back to the driver and print it
    output = hbase_rdd.collect()
    for (k, v) in output:
        print (k, v)
    sc.stop()
and submitted the job like this:
spark-submit --master yarn-client --driver-class-path /opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/*:/usr/hdp/current/hbase-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/* hbase_inputformat.py
My environment is:
CentOS 6.5
HDP 2.2
Spark 1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041
Can anyone suggest how to solve this?
The full log is:
[root@hadoop-node03 hbase]# spark-submit --master yarn-client --driver-class-path /opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/*:/usr/hdp/current/hbase-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/* hbase_test2.py
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-examples-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2015-04-14 22:41:34,839 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing view acls to: root
2015-04-14 22:41:34,846 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing modify acls to: root
2015-04-14 22:41:34,847 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
2015-04-14 22:41:35,459 INFO [sparkDriver-akka.actor.default-dispatcher-4] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2015-04-14 22:41:35,524 INFO [sparkDriver-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2015-04-14 22:41:35,754 INFO [sparkDriver-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriver@hadoop-node03.mathartsys.com:44295]
2015-04-14 22:41:35,764 INFO [Thread-2] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 44295.
2015-04-14 22:41:35,790 INFO [Thread-2] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering MapOutputTracker
2015-04-14 22:41:35,806 INFO [Thread-2] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering BlockManagerMaster
2015-04-14 22:41:35,826 INFO [Thread-2] storage.DiskBlockManager (Logging.scala:logInfo(59)) - Created local directory at /tmp/spark-local-20150414224135-a290
2015-04-14 22:41:35,832 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - MemoryStore started with capacity 265.4 MB
2015-04-14 22:41:36,535 WARN [Thread-2] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-04-14 22:41:36,823 INFO [Thread-2] spark.HttpFileServer (Logging.scala:logInfo(59)) - HTTP File server directory is /tmp/spark-b963d482-e9be-476b-85b0-94ab6cd8076c
2015-04-14 22:41:36,830 INFO [Thread-2] spark.HttpServer (Logging.scala:logInfo(59)) - Starting HTTP Server
2015-04-14 22:41:36,902 INFO [Thread-2] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-04-14 22:41:36,921 INFO [Thread-2] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SocketConnector@0.0.0.0:58608
2015-04-14 22:41:36,925 INFO [Thread-2] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'HTTP file server' on port 58608.
2015-04-14 22:41:37,054 INFO [Thread-2] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-04-14 22:41:37,069 INFO [Thread-2] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector@0.0.0.0:4040
2015-04-14 22:41:37,070 INFO [Thread-2] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'SparkUI' on port 4040.
2015-04-14 22:41:37,073 INFO [Thread-2] ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at http://hadoop-node03.mathartsys.com:4040
2015-04-14 22:41:38,034 INFO [Thread-2] impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://hadoop-node02.mathartsys.com:8188/ws/v1/timeline/
2015-04-14 22:41:38,220 INFO [Thread-2] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at hadoop-node02.mathartsys.com/10.0.0.222:8050
2015-04-14 22:41:38,511 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Requesting a new application from cluster with 3 NodeManagers
2015-04-14 22:41:38,536 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Verifying our application has not requested more than the maximum memory capability of the cluster (15360 MB per container)
2015-04-14 22:41:38,537 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Will allocate AM container, with 896 MB memory including 384 MB overhead
2015-04-14 22:41:38,537 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Setting up container launch context for our AM
2015-04-14 22:41:38,544 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Preparing resources for our AM container
2015-04-14 22:41:39,125 WARN [Thread-2] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(116)) - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2015-04-14 22:41:39,207 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Uploading resource file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar -> hdfs://hadoop-node01.mathartsys.com:8020/user/root/.sparkStaging/application_1428915066363_0013/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar
2015-04-14 22:41:40,428 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Uploading resource file:/root/hbase/hbase_test2.py -> hdfs://hadoop-node01.mathartsys.com:8020/user/root/.sparkStaging/application_1428915066363_0013/hbase_test2.py
2015-04-14 22:41:40,511 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Setting up the launch environment for our AM container
2015-04-14 22:41:40,564 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing view acls to: root
2015-04-14 22:41:40,564 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing modify acls to: root
2015-04-14 22:41:40,565 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
2015-04-14 22:41:40,568 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Submitting application 13 to ResourceManager
2015-04-14 22:41:40,609 INFO [Thread-2] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(251)) - Submitted application application_1428915066363_0013
2015-04-14 22:41:41,615 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:41,621 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) -
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1429022500586
final status: UNDEFINED
tracking URL: http://hadoop-node02.mathartsys.com:8088/proxy/application_1428915066363_0013/
user: root
2015-04-14 22:41:42,624 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:43,627 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:44,631 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:45,635 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:46,278 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@hadoop-node05.mathartsys.com:42992/user/YarnAM#708767775]
2015-04-14 22:41:46,284 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoop-node02.mathartsys.com, PROXY_URI_BASES -> http://hadoop-node02.mathartsys.com:8088/proxy/application_1428915066363_0013), /proxy/application_1428915066363_0013
2015-04-14 22:41:46,287 INFO [sparkDriver-akka.actor.default-dispatcher-4] ui.JettyUtils (Logging.scala:logInfo(59)) - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2015-04-14 22:41:46,638 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: RUNNING)
2015-04-14 22:41:46,639 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) -
client token: N/A
diagnostics: N/A
ApplicationMaster host: hadoop-node05.mathartsys.com
ApplicationMaster RPC port: 0
queue: default
start time: 1429022500586
final status: UNDEFINED
tracking URL: http://hadoop-node02.mathartsys.com:8088/proxy/application_1428915066363_0013/
user: root
2015-04-14 22:41:46,641 INFO [Thread-2] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - Application application_1428915066363_0013 has started running.
2015-04-14 22:41:46,795 INFO [Thread-2] netty.NettyBlockTransferService (Logging.scala:logInfo(59)) - Server created on 56053
2015-04-14 22:41:46,797 INFO [Thread-2] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Trying to register BlockManager
2015-04-14 22:41:46,800 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Registering block manager hadoop-node03.mathartsys.com:56053 with 265.4 MB RAM, BlockManagerId(<driver>, hadoop-node03.mathartsys.com, 56053)
2015-04-14 22:41:46,803 INFO [Thread-2] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Registered BlockManager
2015-04-14 22:41:55,529 INFO [sparkDriver-akka.actor.default-dispatcher-3] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - Registered executor: Actor[akka.tcp://sparkExecutor@hadoop-node06.mathartsys.com:42500/user/Executor#-374031537] with ID 2
2015-04-14 22:41:55,560 INFO [sparkDriver-akka.actor.default-dispatcher-3] util.RackResolver (RackResolver.java:coreResolve(109)) - Resolved hadoop-node06.mathartsys.com to /default-rack
2015-04-14 22:41:55,653 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - Registered executor: Actor[akka.tcp://sparkExecutor@hadoop-node04.mathartsys.com:54112/user/Executor#35135131] with ID 1
2015-04-14 22:41:55,655 INFO [sparkDriver-akka.actor.default-dispatcher-4] util.RackResolver (RackResolver.java:coreResolve(109)) - Resolved hadoop-node04.mathartsys.com to /default-rack
2015-04-14 22:41:55,690 INFO [Thread-2] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
2015-04-14 22:41:55,998 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(298340) called with curMem=0, maxMem=278302556
2015-04-14 22:41:56,001 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_0 stored as values in memory (estimated size 291.3 KB, free 265.1 MB)
2015-04-14 22:41:56,160 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(44100) called with curMem=298340, maxMem=278302556
2015-04-14 22:41:56,161 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_0_piece0 stored as bytes in memory (estimated size 43.1 KB, free 265.1 MB)
2015-04-14 22:41:56,163 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerInfo (Logging.scala:logInfo(59)) - Added broadcast_0_piece0 in memory on hadoop-node03.mathartsys.com:56053 (size: 43.1 KB, free: 265.4 MB)
2015-04-14 22:41:56,164 INFO [Thread-2] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Updated info of block broadcast_0_piece0
2015-04-14 22:41:56,167 INFO [Thread-2] spark.DefaultExecutionContext (Logging.scala:logInfo(59)) - Created broadcast 0 from newAPIHadoopRDD at PythonRDD.scala:516
2015-04-14 22:41:56,204 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(298388) called with curMem=342440, maxMem=278302556
2015-04-14 22:41:56,205 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_1 stored as values in memory (estimated size 291.4 KB, free 264.8 MB)
2015-04-14 22:41:56,279 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(44100) called with curMem=640828, maxMem=278302556
2015-04-14 22:41:56,279 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_1_piece0 stored as bytes in memory (estimated size 43.1 KB, free 264.8 MB)
2015-04-14 22:41:56,281 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerInfo (Logging.scala:logInfo(59)) - Added broadcast_1_piece0 in memory on hadoop-node03.mathartsys.com:56053 (size: 43.1 KB, free: 265.3 MB)
2015-04-14 22:41:56,281 INFO [Thread-2] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Updated info of block broadcast_1_piece0
2015-04-14 22:41:56,283 INFO [Thread-2] spark.DefaultExecutionContext (Logging.scala:logInfo(59)) - Created broadcast 1 from broadcast at PythonRDD.scala:497
2015-04-14 22:41:56,286 INFO [Thread-2] python.Converter (Logging.scala:logInfo(59)) - Loaded converter: org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter
2015-04-14 22:41:56,287 INFO [Thread-2] python.Converter (Logging.scala:logInfo(59)) - Loaded converter: org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter
2015-04-14 22:41:56,400 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Registering block manager hadoop-node06.mathartsys.com:39033 with 530.3 MB RAM, BlockManagerId(2, hadoop-node06.mathartsys.com, 39033)
2015-04-14 22:41:56,434 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Registering block manager hadoop-node04.mathartsys.com:33968 with 530.3 MB RAM, BlockManagerId(1, hadoop-node04.mathartsys.com, 33968)
......
......
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:host.name=hadoop-node03.mathartsys.com
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.version=1.7.0_75
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.vendor=Oracle Corporation
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.home=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.class.path=:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-1.2.0.2.2.0.0-82-yarn-shuffle.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-examples-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/datanucleus-core-3.2.10.jar:/usr/hdp/current/hbase-client/lib/curator-framework-2.6.0.jar:/usr/hdp/current/hbase-client/lib/commons-math-2.1.jar:/usr/hdp/current/hbase-client/lib/zookeeper.jar:/usr/hdp/current/hbase-client/lib/commons-lang-2.6.jar:/usr/hdp/current/hbase-client/lib/commons-io-2.4.jar:/usr/hdp/current/hbase-client/lib/jersey-server-1.8.jar:/usr/hdp/current/hbase-client/lib/servlet-api-2.5.jar:/usr/hdp/current/hbase-client/lib/gson-2.2.4.jar:/usr/hdp/current/hbase-client/lib/jackson-mapper-asl-1.9.13.jar:/usr/hdp/current/hbase-client/lib/hbase-shell.jar:/usr/hdp/current/hbase-client/lib/api-asn1-api-1.0.0-M20.jar:/usr/hdp/current/hbase-client/lib/jasper-runtime-5.5.23.jar:/usr/hdp/current/hbase-client/lib/xercesImpl-2.9.1.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jsch-0.1.42.jar:/usr/hdp/current/hbase-client/lib/xml-apis-1.3.04.jar:/usr/hdp/current/hbase-client/lib/jetty-6.1.26.jar:/usr/hdp/current/hbase-client/lib/commons-httpclient-3.1.jar:/usr/hdp/current/hbase-client/lib/aopalliance-1.0.jar:/usr/hdp/current/hbase-client/lib/hbase-testing-util-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/hbase-it.jar:/usr/hdp/current/hbase-client/lib/hbase-hadoop-compa
t-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/commons-digester-1.8.jar:/usr/hdp/current/hbase-client/lib/servlet-api-2.5-6.1.14.jar:/usr/hdp/current/hbase-client/lib/hbase-server-0.98.4.2.2.0.0-2041-hadoop2-tests.jar:/usr/hdp/current/hbase-client/lib/hamcrest-core-1.3.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar:/usr/hdp/current/hbase-client/lib/slf4j-api-1.6.4.jar:/usr/hdp/current/hbase-client/lib/jersey-guice-1.9.jar:/usr/hdp/current/hbase-client/lib/commons-configuration-1.6.jar:/usr/hdp/current/hbase-client/lib/jetty-sslengine-6.1.26.jar:/usr/hdp/current/hbase-client/lib/commons-codec-1.7.jar:/usr/hdp/current/hbase-client/lib/ranger-plugins-common-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hbase-client/lib/commons-el-1.0.jar:/usr/hdp/current/hbase-client/lib/hbase-hadoop2-compat.jar:/usr/hdp/current/hbase-client/lib/eclipselink-2.5.2-M1.jar:/usr/hdp/current/hbase-client/lib/jamon-runtime-2.3.1.jar:/usr/hdp/current/hbase-client/lib/xmlenc-0.52.jar:/usr/hdp/current/hbase-client/lib/hbase-prefix-tree-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/curator-recipes-2.6.0.jar:/usr/hdp/current/hbase-client/lib/jersey-core-1.8.jar:/usr/hdp/current/hbase-client/lib/hbase-testing-util.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/hdp/current/hbase-client/lib/hbase-shell-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/commons-beanutils-1.7.0.jar:/usr/hdp/current/hbase-client/lib/hbase-hadoop-compat.jar:/usr/hdp/current/hbase-client/lib/leveldbjni-all-1.8.jar:/usr/hdp/current/hbase-client/lib/jasper-compiler-5.5.23.jar:/usr/hdp/current/hbase-client/lib/ojdbc6.jar:/usr/hdp/current/hbase-client/lib/commons-daemon-1.0.13.jar:/usr/hdp/current/hbase-client/lib/api-util-1.0.0-M20.jar:/usr/hdp/current/hbase-client/lib/protobuf-java-2.5.0.jar:/usr/hdp/current/hbase-client/lib/httpclient-4.2.5.jar:/usr/hdp/current/hbase-client/lib/h
trace-core-2.04.jar:/usr/hdp/current/hbase-client/lib/jersey-client-1.9.jar:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/guice-servlet-3.0.jar:/usr/hdp/current/hbase-client/lib/metrics-core-2.2.0.jar:/usr/hdp/current/hbase-client/lib/htrace-core-3.0.4.jar:/usr/hdp/current/hbase-client/lib/paranamer-2.3.jar:/usr/hdp/current/hbase-client/lib/jackson-core-2.2.3.jar:/usr/hdp/current/hbase-client/lib/commons-compress-1.4.1.jar:/usr/hdp/current/hbase-client/lib/jets3t-0.9.0.jar:/usr/hdp/current/hbase-client/lib/microsoft-windowsazure-storage-sdk-0.6.0.jar:/usr/hdp/current/hbase-client/lib/hbase-examples-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jettison-1.3.1.jar:/usr/hdp/current/hbase-client/lib/commons-math3-3.1.1.jar:/usr/hdp/current/hbase-client/lib/jaxb-api-2.2.2.jar:/usr/hdp/current/hbase-client/lib/javax.inject-1.jar:/usr/hdp/current/hbase-client/lib/findbugs-annotations-1.3.9-1.jar:/usr/hdp/current/hbase-client/lib/mysql-connector-java.jar:/usr/hdp/current/hbase-client/lib/hbase-server-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/hbase-common-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jaxb-impl-2.2.3-1.jar:/usr/hdp/current/hbase-client/lib/jackson-xc-1.9.13.jar:/usr/hdp/current/hbase-client/lib/curator-client-2.6.0.jar:/usr/hdp/current/hbase-client/lib/asm-3.1.jar:/usr/hdp/current/hbase-client/lib/jackson-jaxrs-1.9.13.jar:/usr/hdp/current/hbase-client/lib/hbase-thrift-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jackson-core-asl-1.9.13.jar:/usr/hdp/current/hbase-client/lib/commons-cli-1.2.jar:/usr/hdp/current/hbase-client/lib/ranger-plugins-cred-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hbase-client/lib/java-xmlbuilder-0.4.jar:/usr/hdp/current/hbase-client/lib/jsp-2.1-6.1.14.jar:/usr/hdp/current/hbase-client/lib/hbase-prefix-tree.jar:/usr/hdp/current/hbase-client/lib/commons-beanutils-core-1.8.0.jar:/usr/hdp/current/hbase-client/lib/hbase-
hadoop2-compat-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/hbase-it-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/libthrift-0.9.0.jar:/usr/hdp/current/hbase-client/lib/commons-collections-3.2.1.jar:/usr/hdp/current/hbase-client/lib/jruby-complete-1.6.8.jar:/usr/hdp/current/hbase-client/lib/jetty-util-6.1.26.jar:/usr/hdp/current/hbase-client/lib/apacheds-i18n-2.0.0-M15.jar:/usr/hdp/current/hbase-client/lib/ranger-plugins-impl-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hbase-client/lib/log4j-1.2.17.jar:/usr/hdp/current/hbase-client/lib/jersey-json-1.8.jar:/usr/hdp/current/hbase-client/lib/hbase-examples.jar:/usr/hdp/current/hbase-client/lib/hbase-it-0.98.4.2.2.0.0-2041-hadoop2-tests.jar:/usr/hdp/current/hbase-client/lib/xz-1.0.jar:/usr/hdp/current/hbase-client/lib/jsr305-1.3.9.jar:/usr/hdp/current/hbase-client/lib/hbase-thrift.jar:/usr/hdp/current/hbase-client/lib/guice-3.0.jar:/usr/hdp/current/hbase-client/lib/netty-3.6.6.Final.jar:/usr/hdp/current/hbase-client/lib/hbase-common-0.98.4.2.2.0.0-2041-hadoop2-tests.jar:/usr/hdp/current/hbase-client/lib/high-scale-lib-1.1.1.jar:/usr/hdp/current/hbase-client/lib/avro-1.7.4.jar:/usr/hdp/current/hbase-client/lib/httpcore-4.1.3.jar:/usr/hdp/current/hbase-client/lib/commons-logging-1.1.1.jar:/usr/hdp/current/hbase-client/lib/hbase-client-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jsp-api-2.1-6.1.14.jar:/usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/junit-4.11.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/ranger-hbase-plugin-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hbase-client/lib/commons-net-3.1.jar:/usr/hdp/current/hbase-client/lib/snappy-java-1.0.4.1.jar:/usr/hdp/current/hbase-client/lib/activation-1.1.jar:/usr/hdp/current/hbase-client/lib/ranger-plugins-audit-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.jar:/usr/hdp/current/hadoop-mapredu
ce-client/curator-framework-2.6.0.jar:/usr/hdp/current/hadoop-mapreduce-client/metrics-core-3.0.1.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-lang-2.6.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-io-2.4.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-common-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/servlet-api-2.5.jar:/usr/hdp/current/hadoop-mapreduce-client/gson-2.2.4.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-sls.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-distcp-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-mapper-asl-1.9.13.jar:/usr/hdp/current/hadoop-mapreduce-client/api-asn1-api-1.0.0-M20.jar:/usr/hdp/current/hadoop-mapreduce-client/jasper-runtime-5.5.23.jar:/usr/hdp/current/hadoop-mapreduce-client/jsch-0.1.42.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-auth-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/asm-3.2.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-httpclient-3.1.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-openstack.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-databind-2.2.3.jar:/usr/hdp/current/hadoop-mapreduce-client/jersey-core-1.9.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-ant-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/mockito-all-1.8.5.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-2.6.0.2.2.0.0-2041-tests.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-digester-1.8.jar:/usr/hdp/current/hadoop-mapreduce-client/joda-time-2.5.jar:/usr/hdp/current/hadoop-mapreduce-client/hamcrest-core-1.3.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-datajoin.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-ant.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-configuration-
1.6.jar:/usr/hdp/current/hadoop-mapreduce-client/jersey-json-1.9.jar:/usr/hdp/current/hadoop-mapreduce-client/jetty-6.1.26.hwx.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-auth.jar:/usr/hdp/current/hadoop-mapreduce-client/aws-java-sdk-1.7.4.jar:/usr/hdp/current/hadoop-mapreduce-client/jsp-api-2.1.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-el-1.0.jar:/usr/hdp/current/hadoop-mapreduce-client/xmlenc-0.52.jar:/usr/hdp/current/hadoop-mapreduce-client/stax-api-1.0-2.jar:/usr/hdp/current/hadoop-mapreduce-client/curator-recipes-2.6.0.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-aws.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-common.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar:/usr/hdp/current/hadoop-mapreduce-client/jetty-util-6.1.26.hwx.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-distcp.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-archives-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-aws-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-beanutils-1.7.0.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jasper-compiler-5.5.23.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/api-util-1.0.0-M20.jar:/usr/hdp/current/hadoop-mapreduce-client/protobuf-java-2.5.0.jar:/usr/hdp/current/hadoop-mapreduce-client/httpclient-4.2.5.jar:/usr/hdp/current/hadoop-map
reduce-client/hadoop-mapreduce-client-app.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-sls-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/htrace-core-3.0.4.jar:/usr/hdp/current/hadoop-mapreduce-client/paranamer-2.3.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-core-2.2.3.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-compress-1.4.1.jar:/usr/hdp/current/hadoop-mapreduce-client/jets3t-0.9.0.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-gridmix-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/microsoft-windowsazure-storage-sdk-0.6.0.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-math3-3.1.1.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-rumen-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jaxb-api-2.2.2.jar:/usr/hdp/current/hadoop-mapreduce-client/jettison-1.1.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jaxb-impl-2.2.3-1.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-xc-1.9.13.jar:/usr/hdp/current/hadoop-mapreduce-client/curator-client-2.6.0.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-jaxrs-1.9.13.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-core-asl-1.9.13.jar:/usr/hdp/current/hadoop-mapreduce-client/httpcore-4.2.5.jar:/usr/hdp/current/hadoop-mapreduce-client/guava-11.0.2.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-cli-1.2.jar:/usr/hdp/current/hadoop-mapreduce-client/zookeeper-3.4.6.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-annotations-2.2.3.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-datajoin-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jersey-server-1.9.jar:/usr/hdp/current/hadoop-mapreduce-client/java-xmlbuilder-0.4.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-beanutils-core-1.8.0.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-archives.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-collections-3
.2.1.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-codec-1.4.jar:/usr/hdp/current/hadoop-mapreduce-client/apacheds-i18n-2.0.0-M15.jar:/usr/hdp/current/hadoop-mapreduce-client/log4j-1.2.17.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-extras-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.jar:/usr/hdp/current/hadoop-mapreduce-client/xz-1.0.jar:/usr/hdp/current/hadoop-mapreduce-client/jsr305-1.3.9.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-gridmix.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-app-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/netty-3.6.2.Final.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core.jar:/usr/hdp/current/hadoop-mapreduce-client/avro-1.7.4.jar:/usr/hdp/current/hadoop-mapreduce-client/junit-4.11.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-logging-1.1.3.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-extras.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-net-3.1.jar:/usr/hdp/current/hadoop-mapreduce-client/snappy-java-1.0.4.1.jar:/usr/hdp/current/hadoop-mapreduce-client/activation-1.1.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-rumen.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-openstack-2.6.0.2.2.0.0-2041.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/conf:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar:/etc/hadoop/conf:/etc/hadoop/conf
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.io.tmpdir=/tmp
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.compiler=<NA>
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.name=Linux
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.arch=amd64
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.version=2.6.32-504.8.1.el6.x86_64
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.name=root
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.home=/root
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.dir=/root/hbase
2015-04-14 22:41:56,441 INFO [Thread-2] zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=hadoop-node02.mathartsys.com:2181,hadoop-node01.mathartsys.com:2181,hadoop-node03.mathartsys.com:2181 sessionTimeout=30000 watcher=hconnection-0x560cb988, quorum=hadoop-node02.mathartsys.com:2181,hadoop-node01.mathartsys.com:2181,hadoop-node03.mathartsys.com:2181, baseZNode=/hbase-unsecure
2015-04-14 22:41:56,458 INFO [Thread-2] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:<init>(120)) - Process identifier=hconnection-0x560cb988 connecting to ZooKeeper ensemble=hadoop-node02.mathartsys.com:2181,hadoop-node01.mathartsys.com:2181,hadoop-node03.mathartsys.com:2181
2015-04-14 22:41:56,460 INFO [Thread-2-SendThread(hadoop-node02.mathartsys.com:2181)] zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(966)) - Opening socket connection to server hadoop-node02.mathartsys.com/10.0.0.222:2181. Will not attempt to authenticate using SASL (unknown error)
2015-04-14 22:41:56,461 INFO [Thread-2-SendThread(hadoop-node02.mathartsys.com:2181)] zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(849)) - Socket connection established to hadoop-node02.mathartsys.com/10.0.0.222:2181, initiating session
2015-04-14 22:41:56,491 INFO [Thread-2-SendThread(hadoop-node02.mathartsys.com:2181)] zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1207)) - Session establishment complete on server hadoop-node02.mathartsys.com/10.0.0.222:2181, sessionid = 0x24cb25197440023, negotiated timeout = 30000
2015-04-14 22:41:56,605 INFO [Thread-2] util.RegionSizeCalculator (RegionSizeCalculator.java:<init>(76)) - Calculating region sizes for table "test".
2015-04-14 22:41:56,984 WARN [Thread-2] mapreduce.TableInputFormatBase (TableInputFormatBase.java:getSplits(193)) - Cannot resolve the host name for hadoop-node05.mathartsys.com/10.0.0.225 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '225.0.0.10.in-addr.arpa'
2015-04-14 22:41:57,013 INFO [Thread-2] spark.DefaultExecutionContext (Logging.scala:logInfo(59)) - Starting job: first at SerDeUtil.scala:202
......
2015-04-14 22:41:57,107 INFO [sparkDriver-akka.actor.default-dispatcher-3] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.0 in stage 0.0 (TID 0, hadoop-node04.mathartsys.com, RACK_LOCAL, 1312 bytes)
2015-04-14 22:41:57,216 WARN [task-result-getter-0] scheduler.TaskSetManager (Logging.scala:logWarning(71)) - Lost task 0.0 in stage 0.0 (TID 0, hadoop-node04.mathartsys.com): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-04-14 22:41:57,220 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.1 in stage 0.0 (TID 1, hadoop-node06.mathartsys.com, RACK_LOCAL, 1312 bytes)
2015-04-14 22:41:57,303 INFO [task-result-getter-1] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Lost task 0.1 in stage 0.0 (TID 1) on executor hadoop-node06.mathartsys.com: java.lang.IllegalStateException (unread block data) [duplicate 1]
2015-04-14 22:41:57,306 INFO [sparkDriver-akka.actor.default-dispatcher-3] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.2 in stage 0.0 (TID 2, hadoop-node04.mathartsys.com, RACK_LOCAL, 1312 bytes)
2015-04-14 22:41:57,327 INFO [task-result-getter-2] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Lost task 0.2 in stage 0.0 (TID 2) on executor hadoop-node04.mathartsys.com: java.lang.IllegalStateException (unread block data) [duplicate 2]
2015-04-14 22:41:57,330 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.3 in stage 0.0 (TID 3, hadoop-node06.mathartsys.com, RACK_LOCAL, 1312 bytes)
2015-04-14 22:41:57,347 INFO [task-result-getter-3] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Lost task 0.3 in stage 0.0 (TID 3) on executor hadoop-node06.mathartsys.com: java.lang.IllegalStateException (unread block data) [duplicate 3]
2015-04-14 22:41:57,348 ERROR [task-result-getter-3] scheduler.TaskSetManager (Logging.scala:logError(75)) - Task 0 in stage 0.0 failed 4 times; aborting job
2015-04-14 22:41:57,350 INFO [task-result-getter-3] cluster.YarnClientClusterScheduler (Logging.scala:logInfo(59)) - Removed TaskSet 0.0, whose tasks have all completed, from pool
2015-04-14 22:41:57,353 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientClusterScheduler (Logging.scala:logInfo(59)) - Cancelling stage 0
2015-04-14 22:41:57,357 INFO [Thread-2] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Job 0 failed: first at SerDeUtil.scala:202, took 0.343391 s
Traceback (most recent call last):
  File "/root/hbase/hbase_test2.py", line 24, in <module>
    conf=conf)
  File "/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/python/pyspark/context.py", line 530, in newAPIHadoopRDD
    jconf, batchSize)
  File "/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, hadoop-node06.mathartsys.com): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-examples-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[root@hadoop-node03 hbase]#
