Deserialisation error on worker in standalone Spark cluster - hadoop

I have a Spark application that runs fine on a standalone Spark cluster on my laptop
(master and one worker), but fails when I try to run it on a standalone Spark cluster
deployed on EC2 (master and worker are on different machines).
The application is structured as follows:
There is a Java process ('message processor') that runs on the same machine as the Spark master.
When it starts, it submits itself to the Spark master and then listens on SQS;
for each received message it runs a Spark job that processes a file from S3, whose address is configured in the message.
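To make that flow concrete, here is a rough sketch of the loop (illustrative only; the queue URL, the message parsing and the job launcher are placeholders, not my real code):
// Rough sketch of the 'message processor' loop -- illustrative, not the real code.
// Poll SQS, pull the S3 address out of each message, run a Spark job for that file.
AmazonSQS sqs = new AmazonSQSClient();
while (true) {
    for (Message msg : sqs.receiveMessage(QUEUE_URL).getMessages()) {
        String s3Path = extractS3Path(msg.getBody());        // the address is configured in the message
        runSparkJob(s3Path);                                  // uses the SparkContext configured below
        sqs.deleteMessage(QUEUE_URL, msg.getReceiptHandle());
    }
}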
It looks like all of this fails at the point where the Spark driver tries to send the job
to the Spark executor.
Below is the code from the 'message processor' that configures the SparkContext,
followed by the Spark driver log and then the Spark executor log.
The output of my code and some important points are marked in bold, and
I've simplified the code and logs in places for the sake of readability.
I would appreciate your help very much, because I've run out of ideas on this problem.
'message processor' code:
logger.info("Started Integration Hub SubmitDriver in test mode.");
SparkConf sparkConf = new SparkConf()
.setMaster(SPARK_MASTER_URI)
.setAppName(APPLICATION_NAME)
.setSparkHome(SPARK_LOCATION_ON_EC2_MACHINE);
sparkConf.setJars(JavaSparkContext.jarOfClass(this.getClass()));
// configure spark executor to use log4j properties located in the local spark conf dir
sparkConf.set("spark.executor.extraJavaOptions", "-XX:+UseConcMarkSweepGC -Dlog4j.configuration=log4j_integrationhub_sparkexecutor.properties");
sparkConf.set("spark.executor.memory", "1g");
sparkConf.set("spark.cores.max", "3");
// Spill shuffle to disk to avoid OutOfMemory, at cost of reduced performance
sparkConf.set("spark.shuffle.spill", "true");
logger.info("Connecting Spark");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
sc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", AWS_KEY);
sc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", AWS_SECRET);
logger.info("Spark connected");
Driver log:
2015-05-01 07:47:14 INFO ClassPathBeanDefinitionScanner:239 - JSR-330 'javax.inject.Named' annotation found and supported for component scanning
2015-05-01 07:47:14 INFO AnnotationConfigApplicationContext:510 - Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext#5540b23b: startup date [Fri May 01 07:47:14 UTC 2015]; root of context hierarchy
2015-05-01 07:47:14 INFO AutowiredAnnotationBeanPostProcessor:140 - JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
2015-05-01 07:47:14 INFO DefaultListableBeanFactory:596 - Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory#13f948e: defining beans [org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor,integrationHubConfig,org.springframework.context.annotation.ConfigurationClassPostProcessor.importAwareProcessor,processorInlineDriver,s3Accessor,cdFetchUtil,httpUtil,cdPushUtil,submitDriver,databaseLogger,connectorUtil,totangoDataValidations,environmentConfig,sesUtil,processorExecutor,processorDriver]; root of factory hierarchy
2015-05-01 07:47:15 INFO SubmitDriver:69 - Started Integration Hub SubmitDriver in test mode.
2015-05-01 07:47:15 INFO SubmitDriver:101 - Connecting Spark
2015-05-01 07:47:15 INFO SparkContext:59 - Running Spark version 1.3.0
2015-05-01 07:47:16 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-05-01 07:47:16 INFO SecurityManager:59 - Changing view acls to: hadoop
2015-05-01 07:47:16 INFO SecurityManager:59 - Changing modify acls to: hadoop
2015-05-01 07:47:16 INFO SecurityManager:59 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
2015-05-01 07:47:18 INFO Slf4jLogger:80 - Slf4jLogger started
2015-05-01 07:47:18 INFO Remoting:74 - Starting remoting
2015-05-01 07:47:18 INFO Remoting:74 - Remoting started; listening on addresses :[akka.tcp://sparkDriver#sparkMasterIp:39176]
2015-05-01 07:47:18 INFO Utils:59 - Successfully started service 'sparkDriver' on port 39176.
2015-05-01 07:47:18 INFO SparkEnv:59 - Registering MapOutputTracker
2015-05-01 07:47:18 INFO SparkEnv:59 - Registering BlockManagerMaster
2015-05-01 07:47:18 INFO HttpFileServer:59 - HTTP File server directory is /tmp/spark-e4726219-5708-48c9-8377-c103ad1e7a75/httpd-fe68500f-01b1-4241-a3a2-3b4cf8394daf
2015-05-01 07:47:18 INFO HttpServer:59 - Starting HTTP Server
2015-05-01 07:47:19 INFO Server:272 - jetty-8.y.z-SNAPSHOT
2015-05-01 07:47:19 INFO AbstractConnector:338 - Started SocketConnector#0.0.0.0:47166
2015-05-01 07:47:19 INFO Utils:59 - Successfully started service 'HTTP file server' on port 47166.
2015-05-01 07:47:19 INFO SparkEnv:59 - Registering OutputCommitCoordinator
2015-05-01 07:47:24 INFO Server:272 - jetty-8.y.z-SNAPSHOT
2015-05-01 07:47:24 INFO AbstractConnector:338 - Started SelectChannelConnector#0.0.0.0:4040
2015-05-01 07:47:24 INFO Utils:59 - Successfully started service 'SparkUI' on port 4040.
2015-05-01 07:47:24 INFO SparkUI:59 - Started SparkUI at http://sparkMasterIp:4040
2015-05-01 07:47:24 INFO SparkContext:59 - Added JAR /rev/8fcc3a5/integhub_be/genconn/lib/genconn-8fcc3a5.jar at http://sparkMasterIp:47166/jars/genconn-8fcc3a5.jar with timestamp 1430466444838
2015-05-01 07:47:24 INFO AppClient$ClientActor:59 - Connecting to master akka.tcp://sparkMaster#sparkMasterIp:7077/user/Master...
2015-05-01 07:47:25 INFO AppClient$ClientActor:59 - Executor added: app-20150501074725-0005/0 on worker-20150430140019-ip-sparkWorkerIp-38610 (sparkWorkerIp:38610) with 1 cores
2015-05-01 07:47:25 INFO AppClient$ClientActor:59 - Executor updated: app-20150501074725-0005/0 is now LOADING
2015-05-01 07:47:25 INFO AppClient$ClientActor:59 - Executor updated: app-20150501074725-0005/0 is now RUNNING
2015-05-01 07:47:25 INFO NettyBlockTransferService:59 - Server created on 34024
2015-05-01 07:47:26 INFO SubmitDriver:116 - Spark connected
2015-05-01 07:47:26 INFO SubmitDriver:125 - Connected to SQS... Listening on https://sqsAddress
2015-05-01 07:51:39 INFO SubmitDriver:130 - Polling Message queue...
2015-05-01 07:51:47 INFO SubmitDriver:148 - Received Message : {someMessage}
2015-05-01 07:51:47 INFO SubmitDriver:158 - Process Input JSON
2015-05-01 07:51:50 INFO SparkContext:59 - Created broadcast 0 from textFile at ProcessorDriver.java:208
2015-05-01 07:51:52 INFO FileInputFormat:253 - Total input paths to process : 1
2015-05-01 07:51:52 INFO SparkContext:59 - Starting job: first at ConnectorUtil.java:605
2015-05-01 07:51:52 INFO SparkContext:59 - Created broadcast 1 from broadcast at DAGScheduler.scala:839
2015-05-01 07:51:52 WARN TaskSetManager:71 - ... *the warning will be repeated as error below*
2015-05-01 07:51:52 ERROR TaskSetManager:75 - Task 0 in stage 0.0 failed 4 times; aborting job
2015-05-01 07:51:52 ERROR ProcessorDriver:261 - Error executing the batch Operation..
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, sparkWorkerIp): java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2744)
at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1032)
at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
at org.apache.hadoop.io.UTF8.readChars(UTF8.java:216)
at org.apache.hadoop.io.UTF8.readString(UTF8.java:208)
at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:237)
at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
at org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:43)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1137)
at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace: ...
Worker log:
2015-05-01 07:47:26 INFO CoarseGrainedExecutorBackend:47 - Registered signal handlers for [TERM, HUP, INT]
2015-05-01 07:47:26 DEBUG Configuration:227 - java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:78)
at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:43)
at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:220)
at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:128)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:224)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
2015-05-01 07:47:26 DEBUG Groups:139 - Creating new Groups object
2015-05-01 07:47:27 DEBUG Groups:59 - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
2015-05-01 07:47:27 DEBUG Configuration:227 - java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:184)
at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:236)
at org.apache.hadoop.security.KerberosName.<clinit>(KerberosName.java:79)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:209)
at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:226)
at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:44)
at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:220)
at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:128)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:224)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
2015-05-01 07:47:27 DEBUG SparkHadoopUtil:63 - running as user: hadoop
2015-05-01 07:47:27 DEBUG UserGroupInformation:146 - hadoop login
2015-05-01 07:47:27 DEBUG UserGroupInformation:95 - hadoop login commit
2015-05-01 07:47:27 DEBUG UserGroupInformation:125 - using local user:UnixPrincipal: root
2015-05-01 07:47:27 DEBUG UserGroupInformation:493 - UGI loginUser:root
2015-05-01 07:47:27 DEBUG UserGroupInformation:1143 - PriviledgedAction as:hadoop from:org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
2015-05-01 07:47:27 INFO SecurityManager:59 - Changing view acls to: root,hadoop
2015-05-01 07:47:27 INFO SecurityManager:59 - Changing modify acls to: root,hadoop
2015-05-01 07:47:27 INFO SecurityManager:59 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, hadoop); users with modify permissions: Set(root, hadoop)
2015-05-01 07:47:27 DEBUG SecurityManager:63 - SSLConfiguration for file server: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
2015-05-01 07:47:27 DEBUG SecurityManager:63 - SSLConfiguration for Akka: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
2015-05-01 07:47:27 DEBUG AkkaUtils:63 - In createActorSystem, requireCookie is: off
2015-05-01 07:47:28 INFO Slf4jLogger:80 - Slf4jLogger started
2015-05-01 07:47:28 INFO Remoting:74 - Starting remoting
2015-05-01 07:47:29 INFO Remoting:74 - Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher#sparkWorkerIp:49741]
2015-05-01 07:47:29 INFO Utils:59 - Successfully started service 'driverPropsFetcher' on port 49741.
2015-05-01 07:47:29 INFO RemoteActorRefProvider$RemotingTerminator:74 - Shutting down remote daemon.
2015-05-01 07:47:29 INFO RemoteActorRefProvider$RemotingTerminator:74 - Remote daemon shut down; proceeding with flushing remote transports.
2015-05-01 07:47:29 INFO SecurityManager:59 - Changing view acls to: root,hadoop
2015-05-01 07:47:29 INFO SecurityManager:59 - Changing modify acls to: root,hadoop
2015-05-01 07:47:29 INFO SecurityManager:59 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, hadoop); users with modify permissions: Set(root, hadoop)
2015-05-01 07:47:29 DEBUG SecurityManager:63 - SSLConfiguration for file server: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
2015-05-01 07:47:29 DEBUG SecurityManager:63 - SSLConfiguration for Akka: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
2015-05-01 07:47:29 DEBUG AkkaUtils:63 - In createActorSystem, requireCookie is: off
2015-05-01 07:47:29 INFO RemoteActorRefProvider$RemotingTerminator:74 - Remoting shut down.
2015-05-01 07:47:29 INFO Slf4jLogger:80 - Slf4jLogger started
2015-05-01 07:47:29 INFO Remoting:74 - Starting remoting
2015-05-01 07:47:29 INFO Remoting:74 - Remoting started; listening on addresses :[akka.tcp://sparkExecutor#sparkWorkerIp:45299]
2015-05-01 07:47:29 INFO Utils:59 - Successfully started service 'sparkExecutor' on port 45299.
2015-05-01 07:47:29 DEBUG SparkEnv:63 - Using serializer: class org.apache.spark.serializer.JavaSerializer
2015-05-01 07:47:29 INFO AkkaUtils:59 - Connecting to MapOutputTracker: akka.tcp://sparkDriver#sparkMasterIp:39176/user/MapOutputTracker
2015-05-01 07:47:30 INFO AkkaUtils:59 - Connecting to BlockManagerMaster: akka.tcp://sparkDriver#sparkMasterIp:39176/user/BlockManagerMaster
2015-05-01 07:47:30 INFO DiskBlockManager:59 - Created local directory at /mnt/spark/spark-d745cbac-d1cc-47ee-9eba-e99e104732d5/spark-e3963fa3-cab6-4c69-8e78-d23246250a5d/spark-6f1a9653-86fd-401f-bf37-6eca5b6c0adf/blockmgr-ee0e9452-4111-42d0-ab5e-e66317052e4b
2015-05-01 07:47:30 INFO MemoryStore:59 - MemoryStore started with capacity 548.5 MB
2015-05-01 07:47:30 INFO AkkaUtils:59 - Connecting to OutputCommitCoordinator: akka.tcp://sparkDriver#sparkMasterIp:39176/user/OutputCommitCoordinator
2015-05-01 07:47:30 INFO CoarseGrainedExecutorBackend:59 - Connecting to driver: akka.tcp://sparkDriver#sparkMasterIp:39176/user/CoarseGrainedScheduler
2015-05-01 07:47:30 INFO WorkerWatcher:59 - Connecting to worker akka.tcp://sparkWorker#sparkWorkerIp:38610/user/Worker
2015-05-01 07:47:30 DEBUG WorkerWatcher:50 - [actor] received message Associated [akka.tcp://sparkExecutor#sparkWorkerIp:45299] -> [akka.tcp://sparkWorker#sparkWorkerIp:38610] from Actor[akka://sparkExecutor/deadLetters]
2015-05-01 07:47:30 INFO WorkerWatcher:59 - Successfully connected to akka.tcp://sparkWorker#sparkWorkerIp:38610/user/Worker
2015-05-01 07:47:30 DEBUG WorkerWatcher:56 - [actor] handled message (1.18794 ms) Associated [akka.tcp://sparkExecutor#sparkWorkerIp:45299] -> [akka.tcp://sparkWorker#sparkWorkerIp:38610] from Actor[akka://sparkExecutor/deadLetters]
2015-05-01 07:47:30 DEBUG CoarseGrainedExecutorBackend:50 - [actor] received message RegisteredExecutor from Actor[akka.tcp://sparkDriver#sparkMasterIp:39176/user/CoarseGrainedScheduler#-970636338]
2015-05-01 07:47:30 INFO CoarseGrainedExecutorBackend:59 - Successfully registered with driver
2015-05-01 07:47:30 INFO Executor:59 - Starting executor ID 0 on host sparkWorkerIp
2015-05-01 07:47:30 DEBUG InternalLoggerFactory:71 - Using SLF4J as the default logging framework
2015-05-01 07:47:30 DEBUG PlatformDependent0:76 - java.nio.Buffer.address: available
2015-05-01 07:47:30 DEBUG PlatformDependent0:76 - sun.misc.Unsafe.theUnsafe: available
2015-05-01 07:47:30 DEBUG PlatformDependent0:71 - sun.misc.Unsafe.copyMemory: available
2015-05-01 07:47:30 DEBUG PlatformDependent0:76 - java.nio.Bits.unaligned: true
2015-05-01 07:47:30 DEBUG PlatformDependent:76 - UID: 0
2015-05-01 07:47:30 DEBUG PlatformDependent:76 - Java version: 7
2015-05-01 07:47:30 DEBUG PlatformDependent:76 - -Dio.netty.noUnsafe: false
2015-05-01 07:47:30 DEBUG PlatformDependent:76 - sun.misc.Unsafe: available
2015-05-01 07:47:30 DEBUG PlatformDependent:76 - -Dio.netty.noJavassist: false
2015-05-01 07:47:30 DEBUG PlatformDependent:71 - Javassist: unavailable
2015-05-01 07:47:30 DEBUG PlatformDependent:71 - You don't have Javassist in your class path or you don't have enough permission to load dynamically generated classes. Please check the configuration for better performance.
2015-05-01 07:47:30 DEBUG PlatformDependent:76 - -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
2015-05-01 07:47:30 DEBUG PlatformDependent:76 - -Dio.netty.bitMode: 64 (sun.arch.data.model)
2015-05-01 07:47:30 DEBUG PlatformDependent:76 - -Dio.netty.noPreferDirect: false
2015-05-01 07:47:30 DEBUG MultithreadEventLoopGroup:76 - -Dio.netty.eventLoopThreads: 2
2015-05-01 07:47:30 DEBUG NioEventLoop:76 - -Dio.netty.noKeySetOptimization: false
2015-05-01 07:47:30 DEBUG NioEventLoop:76 - -Dio.netty.selectorAutoRebuildThreshold: 512
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.numHeapArenas: 1
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.numDirectArenas: 1
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.pageSize: 8192
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.maxOrder: 11
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.chunkSize: 16777216
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.tinyCacheSize: 512
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.smallCacheSize: 256
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.normalCacheSize: 64
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.maxCachedBufferCapacity: 32768
2015-05-01 07:47:30 DEBUG PooledByteBufAllocator:76 - -Dio.netty.allocator.cacheTrimInterval: 8192
2015-05-01 07:47:30 DEBUG ThreadLocalRandom:71 - -Dio.netty.initialSeedUniquifier: 0x4ac460da6a283b82 (took 1 ms)
2015-05-01 07:47:31 DEBUG ByteBufUtil:76 - -Dio.netty.allocator.type: unpooled
2015-05-01 07:47:31 DEBUG ByteBufUtil:76 - -Dio.netty.threadLocalDirectBufferSize: 65536
2015-05-01 07:47:31 DEBUG NetUtil:86 - Loopback interface: lo (lo, 0:0:0:0:0:0:0:1%1)
2015-05-01 07:47:31 DEBUG NetUtil:81 - /proc/sys/net/core/somaxconn: 128
2015-05-01 07:47:31 DEBUG TransportServer:106 - Shuffle server started on port :46839
2015-05-01 07:47:31 INFO NettyBlockTransferService:59 - Server created on 46839
2015-05-01 07:47:31 INFO BlockManagerMaster:59 - Trying to register BlockManager
2015-05-01 07:47:31 INFO BlockManagerMaster:59 - Registered BlockManager
2015-05-01 07:47:31 INFO AkkaUtils:59 - Connecting to HeartbeatReceiver: akka.tcp://sparkDriver#sparkMasterIp:39176/user/HeartbeatReceiver
2015-05-01 07:47:31 DEBUG CoarseGrainedExecutorBackend:56 - [actor] handled message (339.232401 ms) RegisteredExecutor from Actor[akka.tcp://sparkDriver#sparkMasterIp:39176/user/CoarseGrainedScheduler#-970636338]
2015-05-01 07:51:52 DEBUG CoarseGrainedExecutorBackend:50 - [actor] received message LaunchTask(org.apache.spark.util.SerializableBuffer#608752bf) from Actor[akka.tcp://sparkDriver#sparkMasterIp:39176/user/CoarseGrainedScheduler#-970636338]
2015-05-01 07:51:52 INFO CoarseGrainedExecutorBackend:59 - Got assigned task 0
2015-05-01 07:51:52 DEBUG CoarseGrainedExecutorBackend:56 - [actor] handled message (22.96474 ms) LaunchTask(org.apache.spark.util.SerializableBuffer#608752bf) from Actor[akka.tcp://sparkDriver#sparkMasterIp:39176/user/CoarseGrainedScheduler#-970636338]
2015-05-01 07:51:52 INFO Executor:59 - Running task 0.0 in stage 0.0 (TID 0)
2015-05-01 07:51:52 INFO Executor:59 - Fetching http://sparkMasterIp:47166/jars/genconn-8fcc3a5.jar with timestamp 1430466444838
2015-05-01 07:51:52 DEBUG Configuration:227 - java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:78)
at org.apache.spark.executor.Executor.hadoopConf$lzycompute$1(Executor.scala:356)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$hadoopConf$1(Executor.scala:356)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:375)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:366)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:366)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:184)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-05-01 07:51:52 DEBUG Utils:63 - fetchFile not using security
2015-05-01 07:51:52 INFO Utils:59 - Fetching http://sparkMasterIp:47166/jars/genconn-8fcc3a5.jar to /mnt/spark/spark-d745cbac-d1cc-47ee-9eba-e99e104732d5/spark-e3963fa3-cab6-4c69-8e78-d23246250a5d/spark-0eabace1-ee89-48a3-9a71-0218f0ffc61c/fetchFileTemp2001054150131059247.tmp
2015-05-01 07:51:52 INFO Utils:59 - Copying /mnt/spark/spark-d745cbac-d1cc-47ee-9eba-e99e104732d5/spark-e3963fa3-cab6-4c69-8e78-d23246250a5d/spark-0eabace1-ee89-48a3-9a71-0218f0ffc61c/18615094621430466444838_cache to /mnt/spark-work/app-20150501074725-0005/0/./genconn-8fcc3a5.jar
2015-05-01 07:51:52 INFO Executor:59 - Adding file:/mnt/spark-work/app-20150501074725-0005/0/./genconn-8fcc3a5.jar to class loader
2015-05-01 07:51:52 DEBUG Configuration:227 - java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:227)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:214)
at org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:42)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1137)
at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-05-01 07:51:52 ERROR Executor:96 - Exception in task 0.0 in stage 0.0 (TID 0)
*the error that is printed in the driver log*
...

Related

HDFS NameNode startup very slow with few blocks

I have a fairly small setup (HDP 2.6) with roughly 1429 blocks on a 15 TB HDD. The system has 512 GB RAM and 128 cores (256 threads).
Over the last few days, I've seen the startup of the entire HDP setup go from about 10 minutes to several hours. The culprit turned out to be the NameNode. When the box was first set up without any data, the entire HDP + HCP setup would start up in about 10 minutes (including data and name nodes). We started testing with large volumes of data, and over time our block count went over 23 million. At this point the system took around 3 hours to start. This was mostly due to NameNode startup time, which seems understandable given the large number of blocks.
However, even after deleting all the folders/blocks and leaving behind just 1429 blocks, the system is still taking over 50 minutes to start the NameNode and come out of Safe Mode automatically.
The DataNode logs pause after the Replica Cache line below and then start displaying "Detected pause in JVM or host machine (eg GC)".
2019-10-29 00:30:01,711 INFO datanode.DataNode (LogAdapter.java:info(47)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: user = hdfs
STARTUP_MSG: host = xxxx.corp.com/scrambled.private.ip.address
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.7.3.2.6.5.1100-53
STARTUP_MSG: classpath = removed for brevity
STARTUP_MSG: build = git#github.com:hortonworks/hadoop.git -r 3091053c59a62c82d82c9f778c48bde5ef0a89a1; compiled by 'jenkins' on 2019-03-13T15:40Z
STARTUP_MSG: java = 1.8.0_112
************************************************************/
2019-10-29 00:30:02,253 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(122)) - Scheduling a check for [DISK]file:/hadoop/hdfs/data/
2019-10-29 00:30:04,189 INFO datanode.BlockScanner (BlockScanner.java:<init>(180)) - Initialized block scanner with targetBytesPerSec 1048576
2019-10-29 00:30:04,193 INFO common.Util (Util.java:isDiskStatsEnabled(111)) - dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-10-29 00:30:04,197 INFO datanode.DataNode (DataNode.java:<init>(444)) - File descriptor passing is enabled.
2019-10-29 00:30:04,197 INFO datanode.DataNode (DataNode.java:<init>(455)) - Configured hostname is xxxx.corp.com
2019-10-29 00:30:04,197 INFO common.Util (Util.java:isDiskStatsEnabled(111)) - dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-10-29 00:30:04,198 WARN conf.Configuration (Configuration.java:getTimeDurationHelper(1659)) - No unit for dfs.datanode.outliers.report.interval(1800000) assuming MILLISECONDS
2019-10-29 00:30:04,204 INFO datanode.DataNode (DataNode.java:startDataNode(1251)) - Starting DataNode with maxLockedMemory = 0
2019-10-29 00:30:04,221 INFO datanode.DataNode (DataNode.java:initDataXceiver(1028)) - Opened streaming server at /0.0.0.0:50010
2019-10-29 00:30:04,223 INFO datanode.DataNode (DataXceiverServer.java:<init>(78)) - Balancing bandwith is 6250000 bytes/s
2019-10-29 00:30:04,223 INFO datanode.DataNode (DataXceiverServer.java:<init>(79)) - Number threads for balancing is 5
2019-10-29 00:30:04,225 INFO datanode.DataNode (DataXceiverServer.java:<init>(78)) - Balancing bandwith is 6250000 bytes/s
2019-10-29 00:30:04,225 INFO datanode.DataNode (DataXceiverServer.java:<init>(79)) - Number threads for balancing is 5
2019-10-29 00:30:04,226 INFO datanode.DataNode (DataNode.java:initDataXceiver(1043)) - Listening on UNIX domain socket: /var/lib/hadoop-hdfs/dn_socket
2019-10-29 00:30:04,296 INFO mortbay.log (Slf4jLog.java:info(67)) - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2019-10-29 00:30:04,304 INFO server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(296)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2019-10-29 00:30:04,308 INFO http.HttpRequestLog (HttpRequestLog.java:getRequestLog(80)) - Http request log for http.requests.datanode is not defined
2019-10-29 00:30:04,313 INFO http.HttpServer2 (HttpServer2.java:addGlobalFilter(788)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2019-10-29 00:30:04,315 INFO http.HttpServer2 (HttpServer2.java:addFilter(763)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context datanode
2019-10-29 00:30:04,337 INFO http.HttpServer2 (HttpServer2.java:bindListener(986)) - Jetty bound to port 43272
2019-10-29 00:30:04,338 INFO mortbay.log (Slf4jLog.java:info(67)) - jetty-6.1.26.hwx
2019-10-29 00:30:04,511 INFO mortbay.log (Slf4jLog.java:info(67)) - Started HttpServer2$SelectChannelConnectorWithSafeStartup#localhost:43272
2019-10-29 00:30:07,643 INFO web.DatanodeHttpServer (DatanodeHttpServer.java:start(233)) - Listening HTTP traffic on /0.0.0.0:50075
2019-10-29 00:30:07,647 INFO util.JvmPauseMonitor (JvmPauseMonitor.java:run(179)) - Starting JVM pause monitor
2019-10-29 00:30:08,366 INFO datanode.DataNode (DataNode.java:startDataNode(1277)) - dnUserName = hdfs
2019-10-29 00:30:08,366 INFO datanode.DataNode (DataNode.java:startDataNode(1278)) - supergroup = hdfs
2019-10-29 00:30:08,579 INFO ipc.CallQueueManager (CallQueueManager.java:<init>(75)) - Using callQueue: class java.util.concurrent.LinkedBlockingQueue scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
2019-10-29 00:30:08,734 INFO ipc.Server (Server.java:run(821)) - Starting Socket Reader #1 for port 8010
2019-10-29 00:30:09,244 INFO datanode.DataNode (DataNode.java:initIpcServer(941)) - Opened IPC server at /0.0.0.0:8010
2019-10-29 00:30:09,258 INFO datanode.DataNode (BlockPoolManager.java:refreshNamenodes(152)) - Refresh request received for nameservices: null
2019-10-29 00:30:09,274 INFO datanode.DataNode (BlockPoolManager.java:doRefreshNamenodes(201)) - Starting BPOfferServices for nameservices: <default>
2019-10-29 00:30:09,430 INFO datanode.DataNode (BPServiceActor.java:run(761)) - Block pool <registering> (Datanode Uuid unassigned) service to xxxx.corp.com/scrambled.private.ip.address:8020 starting to offer service
2019-10-29 00:30:09,434 INFO ipc.Server (Server.java:run(1064)) - IPC Server Responder: starting
2019-10-29 00:30:09,434 INFO ipc.Server (Server.java:run(900)) - IPC Server listener on 8010: starting
2019-10-29 00:30:10,930 INFO common.Storage (DataStorage.java:getParallelVolumeLoadThreadsNum(384)) - Using 1 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=1, dataDirs=1)
2019-10-29 00:30:10,962 INFO common.Storage (Storage.java:tryLock(776)) - Lock on /hadoop/hdfs/data/in_use.lock acquired by nodename 210295#xxxx.corp.com
2019-10-29 00:30:11,121 INFO common.Storage (BlockPoolSliceStorage.java:recoverTransitionRead(250)) - Analyzing storage directories for bpid BP-814497463-127.0.0.1-1558792659773
2019-10-29 00:30:11,121 INFO common.Storage (Storage.java:lock(735)) - Locking is disabled for /hadoop/hdfs/data/current/BP-814497463-127.0.0.1-1558792659773
2019-10-29 00:30:11,139 INFO datanode.DataNode (DataNode.java:initStorage(1546)) - Setting up storage: nsid=875919329;bpid=BP-814497463-127.0.0.1-1558792659773;lv=-56;nsInfo=lv=-63;cid=CID-49b9105e-fc0d-4ea4-9d2f-caceb95ce4bb;nsid=875919329;c=0;bpid=BP-814497463-127.0.0.1-1558792659773;dnuuid=0aff4a22-3f1a-485b-9aec-46fd881dfab0
2019-10-29 00:30:11,523 INFO impl.FsDatasetImpl (FsVolumeList.java:addVolume(295)) - Added new volume: DS-ea7ed3be-90ad-4424-a00c-577601814d81
2019-10-29 00:30:11,523 INFO impl.FsDatasetImpl (FsDatasetImpl.java:addVolume(426)) - Added volume - /hadoop/hdfs/data/current, StorageType: DISK
2019-10-29 00:30:11,527 INFO impl.FsDatasetImpl (FsDatasetImpl.java:registerMBean(2203)) - Registered FSDatasetState MBean
2019-10-29 00:30:11,711 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(122)) - Scheduling a check for /hadoop/hdfs/data/current
2019-10-29 00:30:11,719 INFO checker.DatasetVolumeChecker (DatasetVolumeChecker.java:checkAllVolumes(210)) - Scheduled health check for volume /hadoop/hdfs/data/current
2019-10-29 00:30:11,721 INFO impl.FsDatasetImpl (FsDatasetImpl.java:addBlockPool(2686)) - Adding block pool BP-814497463-127.0.0.1-1558792659773
2019-10-29 00:30:11,722 INFO impl.FsDatasetImpl (FsVolumeList.java:run(392)) - Scanning block pool BP-814497463-127.0.0.1-1558792659773 on volume /hadoop/hdfs/data/current...
2019-10-29 00:30:11,898 INFO impl.FsDatasetImpl (BlockPoolSlice.java:loadDfsUsed(251)) - Cached dfsUsed found for /hadoop/hdfs/data/current/BP-814497463-127.0.0.1-1558792659773/current: 414855315456
2019-10-29 00:30:11,901 INFO impl.FsDatasetImpl (FsVolumeList.java:run(397)) - Time taken to scan block pool BP-814497463-127.0.0.1-1558792659773 on /hadoop/hdfs/data/current: 178ms
2019-10-29 00:30:11,901 INFO impl.FsDatasetImpl (FsVolumeList.java:addBlockPool(423)) - Total time to scan all replicas for block pool BP-814497463-127.0.0.1-1558792659773: 180ms
2019-10-29 00:30:11,906 INFO impl.FsDatasetImpl (FsVolumeList.java:run(188)) - Adding replicas to map for block pool BP-814497463-127.0.0.1-1558792659773 on volume /hadoop/hdfs/data/current...
2019-10-29 00:30:11,906 INFO impl.BlockPoolSlice (BlockPoolSlice.java:readReplicasFromCache(738)) - Replica Cache file: /hadoop/hdfs/data/current/BP-814497463-127.0.0.1-1558792659773/current/replicas doesn't exist
2019-10-29 00:31:24,607 INFO timeline.HadoopTimelineMetricsSink
The corresponding NameNode log shows the following and keeps repeating "The reported blocks 0 needs additional 1429 blocks to reach the threshold 1.0000 of total blocks 1428."
2019-10-29 00:30:20,165 INFO namenode.NameNode (LogAdapter.java:info(47)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: user = hdfs
STARTUP_MSG: host = xxxx.corp.com/scrambled.private.ip.address
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.7.3.2.6.5.1100-53
STARTUP_MSG: classpath = removed for brevity
STARTUP_MSG: build = git#github.com:hortonworks/hadoop.git -r 3091053c59a62c82d82c9f778c48bde5ef0a89a1; compiled by 'jenkins' on 2019-03-13T15:40Z
STARTUP_MSG: java = 1.8.0_112
***************/
2019-10-29 00:30:20,176 INFO namenode.NameNode (NameNode.java:createNameNode(1624)) - createNameNode []
2019-10-29 00:30:20,747 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(450)) - fs.defaultFS is hdfs://xxxx.corp.com:8020
2019-10-29 00:30:20,748 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(470)) - Clients are to use xxxx.corp.com:8020 to access this namenode/service.
2019-10-29 00:30:20,866 INFO util.JvmPauseMonitor (JvmPauseMonitor.java:run(179)) - Starting JVM pause monitor
2019-10-29 00:30:20,874 INFO hdfs.DFSUtil (DFSUtil.java:httpServerTemplateForNNAndJN(1803)) - Starting Web-server for hdfs at: http://xxxx.corp.com:50070
2019-10-29 00:30:20,923 INFO server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(296)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2019-10-29 00:30:20,927 INFO http.HttpRequestLog (HttpRequestLog.java:getRequestLog(80)) - Http request log for http.requests.namenode is not defined
2019-10-29 00:30:20,931 INFO http.HttpServer2 (HttpServer2.java:addGlobalFilter(788)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2019-10-29 00:30:20,933 INFO http.HttpServer2 (HttpServer2.java:addFilter(763)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2019-10-29 00:30:20,933 INFO http.HttpServer2 (HttpServer2.java:addFilter(771)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2019-10-29 00:30:20,933 INFO http.HttpServer2 (HttpServer2.java:addFilter(771)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2019-10-29 00:30:20,934 INFO security.HttpCrossOriginFilterInitializer (HttpCrossOriginFilterInitializer.java:initFilter(49)) - CORS filter not enabled. Please set hadoop.http.cross-origin.enabled to 'true' to enable it
2019-10-29 00:30:20,953 INFO http.HttpServer2 (NameNodeHttpServer.java:initWebHdfs(93)) - Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
2019-10-29 00:30:20,954 INFO http.HttpServer2 (HttpServer2.java:addJerseyResourcePackage(687)) - addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2019-10-29 00:30:20,961 INFO http.HttpServer2 (HttpServer2.java:bindListener(986)) - Jetty bound to port 50070
2019-10-29 00:30:20,962 INFO mortbay.log (Slf4jLog.java:info(67)) - jetty-6.1.26.hwx
2019-10-29 00:30:20,986 WARN mortbay.log (Slf4jLog.java:warn(76)) - Can't reuse /tmp/Jetty_xxxx_corp_com_50070_hdfs____ggu70m, using /tmp/Jetty_xxxx_corp_com_50070_hdfs____ggu70m_2845921744604868870
2019-10-29 00:30:21,121 INFO mortbay.log (Slf4jLog.java:info(67)) - Started HttpServer2$SelectChannelConnectorWithSafeStartup#xxxx.corp.com:50070
2019-10-29 00:30:21,143 WARN common.Util (Util.java:stringAsURI(57)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2019-10-29 00:30:21,143 WARN common.Util (Util.java:stringAsURI(57)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2019-10-29 00:30:21,143 WARN namenode.FSNamesystem (FSNamesystem.java:checkConfiguration(690)) - Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
2019-10-29 00:30:21,143 WARN namenode.FSNamesystem (FSNamesystem.java:checkConfiguration(695)) - Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of data loss due to lack of redundant storage directories!
2019-10-29 00:30:21,148 WARN common.Util (Util.java:stringAsURI(57)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2019-10-29 00:30:21,148 WARN common.Util (Util.java:stringAsURI(57)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2019-10-29 00:30:21,153 WARN common.Storage (NNStorage.java:setRestoreFailedStorage(208)) - set restore failed storage to true
2019-10-29 00:30:21,172 INFO namenode.FSEditLog (FSEditLog.java:newInstance(225)) - Edit logging is async:false
2019-10-29 00:30:21,176 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(759)) - No KeyProvider found.
2019-10-29 00:30:21,176 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(765)) - Enabling async auditlog
2019-10-29 00:30:21,178 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(769)) - fsLock is fair:false
2019-10-29 00:30:21,204 INFO blockmanagement.HeartbeatManager (HeartbeatManager.java:<init>(91)) - Setting heartbeat recheck interval to 30000 since dfs.namenode.stale.datanode.interval is less than dfs.namenode.heartbeat.recheck-interval
2019-10-29 00:30:21,207 INFO common.Util (Util.java:isDiskStatsEnabled(111)) - dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-10-29 00:30:21,214 INFO blockmanagement.DatanodeManager (DatanodeManager.java:<init>(274)) - dfs.block.invalidate.limit=1000
2019-10-29 00:30:21,214 INFO blockmanagement.DatanodeManager (DatanodeManager.java:<init>(280)) - dfs.namenode.datanode.registration.ip-hostname-check=true
2019-10-29 00:30:21,215 INFO blockmanagement.BlockManager (InvalidateBlocks.java:printBlockDeletionTime(71)) - dfs.namenode.startup.delay.block.deletion.sec is set to 000:01:00:00.000
2019-10-29 00:30:21,215 INFO blockmanagement.BlockManager (InvalidateBlocks.java:printBlockDeletionTime(76)) - The block deletion will start around 2019 Oct 29 01:30:21
2019-10-29 00:30:21,217 INFO util.GSet (LightWeightGSet.java:computeCapacity(395)) - Computing capacity for map BlocksMap
2019-10-29 00:30:21,217 INFO util.GSet (LightWeightGSet.java:computeCapacity(396)) - VM type = 64-bit
2019-10-29 00:30:21,218 INFO util.GSet (LightWeightGSet.java:computeCapacity(397)) - 2.0% max memory 1011.3 MB = 20.2 MB
2019-10-29 00:30:21,218 INFO util.GSet (LightWeightGSet.java:computeCapacity(402)) - capacity = 2^21 = 2097152 entries
2019-10-29 00:30:21,231 INFO blockmanagement.BlockManager (BlockManager.java:createBlockTokenSecretManager(409)) - dfs.block.access.token.enable=true
2019-10-29 00:30:21,231 INFO blockmanagement.BlockManager (BlockManager.java:createBlockTokenSecretManager(430)) - dfs.block.access.key.update.interval=600 min(s), dfs.block.access.token.lifetime=600 min(s), dfs.encrypt.data.transfer.algorithm=null
2019-10-29 00:30:21,354 INFO blockmanagement.BlockManager (BlockManager.java:<init>(395)) - defaultReplication = 1
2019-10-29 00:30:21,354 INFO blockmanagement.BlockManager (BlockManager.java:<init>(396)) - maxReplication = 50
2019-10-29 00:30:21,354 INFO blockmanagement.BlockManager (BlockManager.java:<init>(397)) - minReplication = 1
2019-10-29 00:30:21,354 INFO blockmanagement.BlockManager (BlockManager.java:<init>(398)) - maxReplicationStreams = 2
2019-10-29 00:30:21,354 INFO blockmanagement.BlockManager (BlockManager.java:<init>(399)) - replicationRecheckInterval = 3000
2019-10-29 00:30:21,354 INFO blockmanagement.BlockManager (BlockManager.java:<init>(400)) - encryptDataTransfer = false
2019-10-29 00:30:21,354 INFO blockmanagement.BlockManager (BlockManager.java:<init>(401)) - maxNumBlocksToLog = 1000
2019-10-29 00:30:21,360 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(790)) - fsOwner = hdfs (auth:SIMPLE)
2019-10-29 00:30:21,360 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(791)) - supergroup = hdfs
2019-10-29 00:30:21,360 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(792)) - isPermissionEnabled = true
2019-10-29 00:30:21,360 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(803)) - HA Enabled: false
2019-10-29 00:30:21,361 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(843)) - Append Enabled: true
2019-10-29 00:30:21,388 INFO util.GSet (LightWeightGSet.java:computeCapacity(395)) - Computing capacity for map INodeMap
2019-10-29 00:30:21,388 INFO util.GSet (LightWeightGSet.java:computeCapacity(396)) - VM type = 64-bit
2019-10-29 00:30:21,388 INFO util.GSet (LightWeightGSet.java:computeCapacity(397)) - 1.0% max memory 1011.3 MB = 10.1 MB
2019-10-29 00:30:21,389 INFO util.GSet (LightWeightGSet.java:computeCapacity(402)) - capacity = 2^20 = 1048576 entries
2019-10-29 00:30:21,393 INFO namenode.FSDirectory (FSDirectory.java:<init>(256)) - ACLs enabled? false
2019-10-29 00:30:21,393 INFO namenode.FSDirectory (FSDirectory.java:<init>(260)) - XAttrs enabled? true
2019-10-29 00:30:21,393 INFO namenode.FSDirectory (FSDirectory.java:<init>(268)) - Maximum size of an xattr: 16384
2019-10-29 00:30:21,393 INFO namenode.NameNode (FSDirectory.java:<init>(321)) - Caching file names occuring more than 10 times
2019-10-29 00:30:21,399 INFO util.GSet (LightWeightGSet.java:computeCapacity(395)) - Computing capacity for map cachedBlocks
2019-10-29 00:30:21,399 INFO util.GSet (LightWeightGSet.java:computeCapacity(396)) - VM type = 64-bit
2019-10-29 00:30:21,400 INFO util.GSet (LightWeightGSet.java:computeCapacity(397)) - 0.25% max memory 1011.3 MB = 2.5 MB
2019-10-29 00:30:21,400 INFO util.GSet (LightWeightGSet.java:computeCapacity(402)) - capacity = 2^18 = 262144 entries
2019-10-29 00:30:21,402 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(5582)) - dfs.namenode.safemode.threshold-pct = 1.0
2019-10-29 00:30:21,402 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(5583)) - dfs.namenode.safemode.min.datanodes = 0
2019-10-29 00:30:21,402 INFO namenode.FSNamesystem (FSNamesystem.java:<init>(5584)) - dfs.namenode.safemode.extension = 30000
2019-10-29 00:30:21,405 INFO metrics.TopMetrics (TopMetrics.java:logConf(76)) - NNTop conf: dfs.namenode.top.window.num.buckets = 10
2019-10-29 00:30:21,405 INFO metrics.TopMetrics (TopMetrics.java:logConf(78)) - NNTop conf: dfs.namenode.top.num.users = 10
2019-10-29 00:30:21,405 INFO metrics.TopMetrics (TopMetrics.java:logConf(80)) - NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2019-10-29 00:30:21,408 INFO namenode.FSNamesystem (FSNamesystem.java:initRetryCache(971)) - Retry cache on namenode is enabled
2019-10-29 00:30:21,408 INFO namenode.FSNamesystem (FSNamesystem.java:initRetryCache(979)) - Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2019-10-29 00:30:21,410 INFO util.GSet (LightWeightGSet.java:computeCapacity(395)) - Computing capacity for map NameNodeRetryCache
2019-10-29 00:30:21,411 INFO util.GSet (LightWeightGSet.java:computeCapacity(396)) - VM type = 64-bit
2019-10-29 00:30:21,411 INFO util.GSet (LightWeightGSet.java:computeCapacity(397)) - 0.029999999329447746% max memory 1011.3 MB = 310.7 KB
2019-10-29 00:30:21,411 INFO util.GSet (LightWeightGSet.java:computeCapacity(402)) - capacity = 2^15 = 32768 entries
2019-10-29 00:30:21,456 INFO common.Storage (Storage.java:tryLock(776)) - Lock on /home/hadoop/hdfs/namenode/in_use.lock acquired by nodename 211070#xxxx.corp.com
2019-10-29 00:30:21,503 INFO namenode.FileJournalManager (FileJournalManager.java:recoverUnfinalizedSegments(388)) - Recovering unfinalized segments in /home/hadoop/hdfs/namenode/current
2019-10-29 00:30:21,527 INFO namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file /home/hadoop/hdfs/namenode/current/edits_inprogress_0000000000199282266 -> /home/hadoop/hdfs/namenode/current/edits_0000000000199282266-0000000000199282266
2019-10-29 00:30:21,532 INFO namenode.FSImage (FSImage.java:loadFSImageFile(745)) - Planning to load image: FSImageFile(file=/home/hadoop/hdfs/namenode/current/fsimage_0000000000199282232, cpktTxId=0000000000199282232)
2019-10-29 00:30:21,562 INFO namenode.FSImageFormatPBINode (FSImageFormatPBINode.java:loadINodeSection(257)) - Loading 1993 INodes.
2019-10-29 00:30:21,724 INFO namenode.FSImageFormatProtobuf (FSImageFormatProtobuf.java:load(184)) - Loaded FSImage in 0 seconds.
2019-10-29 00:30:21,725 INFO namenode.FSImage (FSImage.java:loadFSImage(911)) - Loaded image for txid 199282232 from /home/hadoop/hdfs/namenode/current/fsimage_0000000000199282232
2019-10-29 00:30:21,725 INFO namenode.FSImage (FSImage.java:loadEdits(849)) - Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream#63fd4873 expecting start txid #199282233
2019-10-29 00:30:21,726 INFO namenode.FSImage (FSEditLogLoader.java:loadFSEdits(142)) - Start loading edits file /home/hadoop/hdfs/namenode/current/edits_0000000000199282233-0000000000199282265
2019-10-29 00:30:21,729 INFO namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(177)) - Fast-forwarding stream '/home/hadoop/hdfs/namenode/current/edits_0000000000199282233-0000000000199282265' to transaction ID 199282233
2019-10-29 00:30:21,752 INFO namenode.FSImage (FSEditLogLoader.java:loadFSEdits(145)) - Edits file /home/hadoop/hdfs/namenode/current/edits_0000000000199282233-0000000000199282265 of size 1048576 edits # 33 loaded in 0 seconds
2019-10-29 00:30:21,752 INFO namenode.FSImage (FSImage.java:loadEdits(849)) - Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream#1e11bc55 expecting start txid #199282266
2019-10-29 00:30:21,752 INFO namenode.FSImage (FSEditLogLoader.java:loadFSEdits(142)) - Start loading edits file /home/hadoop/hdfs/namenode/current/edits_0000000000199282266-0000000000199282266
2019-10-29 00:30:21,752 INFO namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(177)) - Fast-forwarding stream '/home/hadoop/hdfs/namenode/current/edits_0000000000199282266-0000000000199282266' to transaction ID 199282233
2019-10-29 00:30:21,753 INFO namenode.FSImage (FSEditLogLoader.java:loadFSEdits(145)) - Edits file /home/hadoop/hdfs/namenode/current/edits_0000000000199282266-0000000000199282266 of size 1048576 edits # 1 loaded in 0 seconds
2019-10-29 00:30:21,753 INFO namenode.FSNamesystem (FSNamesystem.java:loadFSImage(1083)) - Need to save fs image? false (staleImage=false, haEnabled=false, isRollingUpgrade=false)
2019-10-29 00:30:21,754 INFO namenode.FSEditLog (FSEditLog.java:startLogSegment(1294)) - Starting log segment at 199282267
2019-10-29 00:30:21,880 INFO namenode.NameCache (NameCache.java:initialized(143)) - initialized with 8 entries 214 lookups
2019-10-29 00:30:21,881 INFO namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(731)) - Finished loading FSImage in 465 msecs
2019-10-29 00:30:22,002 INFO namenode.NameNode (NameNodeRpcServer.java:<init>(428)) - RPC server is binding to xxxx.corp.com:8020
2019-10-29 00:30:22,007 INFO ipc.CallQueueManager (CallQueueManager.java:<init>(75)) - Using callQueue: class java.util.concurrent.LinkedBlockingQueue scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
2019-10-29 00:30:22,015 INFO ipc.Server (Server.java:run(821)) - Starting Socket Reader #1 for port 8020
2019-10-29 00:30:22,049 INFO namenode.FSNamesystem (FSNamesystem.java:registerMBean(6517)) - Registered FSNamesystemState MBean
2019-10-29 00:30:22,050 WARN common.Util (Util.java:stringAsURI(57)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2019-10-29 00:30:22,064 INFO namenode.LeaseManager (LeaseManager.java:getNumUnderConstructionBlocks(139)) - Number of blocks under construction: 0
2019-10-29 00:30:22,065 INFO hdfs.StateChange (FSNamesystem.java:reportStatus(5952)) - STATE* Safe mode ON.
The reported blocks 0 needs additional 1429 blocks to reach the threshold 1.0000 of total blocks 1428.
The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
2019-10-29 00:30:22,075 INFO blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateHeartbeatState(401)) - Number of failed storage changes from 0 to 0
2019-10-29 00:30:22,077 INFO block.BlockTokenSecretManager (BlockTokenSecretManager.java:updateKeys(222)) - Updating block keys
2019-10-29 00:30:22,095 INFO ipc.Server (Server.java:run(1064)) - IPC Server Responder: starting
2019-10-29 00:30:22,095 INFO ipc.Server (Server.java:run(900)) - IPC Server listener on 8020: starting
2019-10-29 00:30:22,115 INFO namenode.NameNode (NameNode.java:startCommonServices(885)) - NameNode RPC up at: xxxx.corp.com/scrambled.private.ip.address:8020
2019-10-29 00:30:22,116 INFO namenode.FSNamesystem (FSNamesystem.java:startActiveServices(1191)) - Starting services required for active state
2019-10-29 00:30:22,116 INFO namenode.FSDirectory (FSDirectory.java:updateCountForQuota(708)) - Initializing quota with 4 thread(s)
2019-10-29 00:30:22,127 INFO namenode.FSDirectory (FSDirectory.java:updateCountForQuota(717)) - Quota initialization completed in 11 milliseconds
name space=1995
storage space=6473571992
storage types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0
2019-10-29 00:30:22,131 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(161)) - Starting CacheReplicationMonitor with interval 30000 milliseconds
2019-10-29 00:30:22,525 INFO fs.TrashPolicyDefault (TrashPolicyDefault.java:<init>(228)) - The configured checkpoint interval is 0 minutes. Using an interval of 60 minutes that is used for deletion instead
2019-10-29 00:31:52,817 INFO ipc.Server (Server.java:logException(2428)) - IPC Server handler 29 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.mkdirs from scrambled.private.ip.address:55080 Call#143 Retry#0: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /tmp/hive/anonymous/393adcb5-cd66-43a7-ab38-e759f5daf88e. Name node is in safe mode.
The reported blocks 0 needs additional 1429 blocks to reach the threshold 1.0000 of total blocks 1428.
The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
What exactly is going on, and how do I go about troubleshooting this? I tried increasing the heap size for both the NameNode and the DataNode as well. The GC messages from the DataNode disappear, but I still see them in the NameNode logs when it reads INodes.
Any help will be greatly appreciated.

Spark fails to start in local mode when disconnected [Possible bug in handling IPv6 in Spark??]

The problem is the same as described in "Error when starting spark-shell local on Mac",
but I have failed to find a solution. I also used to get the malformed URI error, but now I get "expected hostname".
When I am not connected to the internet, spark-shell fails to load in local mode [see the error below]. I am running Apache Spark 2.1.0, downloaded from the internet, on my Mac; I run ./bin/spark-shell and it gives me the error below.
I have read the Spark code, and it uses Java's InetAddress.getLocalHost() to find the local host's IP address. When I am connected to the internet, I get back an IPv4 address with my local hostname:
scala> InetAddress.getLocalHost
res9: java.net.InetAddress = AliKheyrollahis-MacBook-Pro.local/192.168.1.26
but the key is that, when disconnected, I get an IPv6 address with a percent sign in it (it is scoped):
scala> InetAddress.getLocalHost
res10: java.net.InetAddress = AliKheyrollahis-MacBook-Pro.local/fe80:0:0:0:2b9a:4521:a301:e9a5%10
And this IP is the same as the one you see in the error message. I believe the problem is that Spark cannot handle the %10 scope ID in the result, and that is what makes it throw.
My guess is that this is a bug witnessed by very few people, since most are always connected to the internet or their Mac does not return a scoped IPv6 address. Even if I can just configure my Mac to get around the issue, I am happy, but I have tried everything, including setting IPv6 to manual or link-local only, to no avail.
I have also tried removing the ::1 localhost line from /etc/hosts, to no avail.
So here is the full error with DEBUG output (please note the same IPv6 being used to listen):
17/01/28 22:02:59 DEBUG ShutdownHookManager: Adding shutdown hook
17/01/28 22:03:06 DEBUG Shell: setsid is not available on this machine. So not using it.
17/01/28 22:03:06 DEBUG Shell: setsid exited with exit code 0
17/01/28 22:03:06 INFO SparkContext: Running Spark version 2.1.0
17/01/28 22:03:06 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation #org.apache.hadoop.metrics2.annotation.Metric(sampleName=Ops, always=false, about=, type=DEFAULT, value=[Rate of successful kerberos logins and latency (milliseconds)], valueName=Time)
17/01/28 22:03:06 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation #org.apache.hadoop.metrics2.annotation.Metric(sampleName=Ops, always=false, about=, type=DEFAULT, value=[Rate of failed kerberos logins and latency (milliseconds)], valueName=Time)
17/01/28 22:03:06 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation #org.apache.hadoop.metrics2.annotation.Metric(sampleName=Ops, always=false, about=, type=DEFAULT, value=[GetGroups], valueName=Time)
17/01/28 22:03:06 DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
17/01/28 22:03:26 DEBUG KerberosName: Kerberos krb5 configuration not found, setting default realm to empty
17/01/28 22:03:26 DEBUG Groups: Creating new Groups object
17/01/28 22:03:26 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
17/01/28 22:03:26 DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
17/01/28 22:03:26 DEBUG NativeCodeLoader: java.library.path=/Users/aliostad/torch/install/lib:/Users/aliostad/torch/install/lib:/Users/aliostad/torch/install/lib:/Users/aliostad/torch/install/lib:/Users/aliostad/torch/install/lib:/Users/aliostad/torch/install/lib:/Users/aliostad/torch/install/lib::/Users/aliostad/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
17/01/28 22:03:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/01/28 22:03:26 DEBUG PerformanceAdvisory: Falling back to shell based
17/01/28 22:03:26 DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
17/01/28 22:03:27 DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
17/01/28 22:03:27 DEBUG UserGroupInformation: hadoop login
17/01/28 22:03:27 DEBUG UserGroupInformation: hadoop login commit
17/01/28 22:03:27 DEBUG UserGroupInformation: using local user:UnixPrincipal: aliostad
17/01/28 22:03:27 DEBUG UserGroupInformation: Using user: "UnixPrincipal: aliostad" with name aliostad
17/01/28 22:03:27 DEBUG UserGroupInformation: User entry: "aliostad"
17/01/28 22:03:27 DEBUG UserGroupInformation: UGI loginUser:aliostad (auth:SIMPLE)
17/01/28 22:03:27 INFO SecurityManager: Changing view acls to: aliostad
17/01/28 22:03:27 INFO SecurityManager: Changing modify acls to: aliostad
17/01/28 22:03:27 INFO SecurityManager: Changing view acls groups to:
17/01/28 22:03:27 INFO SecurityManager: Changing modify acls groups to:
17/01/28 22:03:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(aliostad); groups with view permissions: Set(); users with modify permissions: Set(aliostad); groups with modify permissions: Set()
17/01/28 22:03:27 DEBUG SecurityManager: Created SSL options for fs: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
17/01/28 22:03:27 DEBUG InternalLoggerFactory: Using SLF4J as the default logging framework
17/01/28 22:03:27 DEBUG PlatformDependent0: java.nio.Buffer.address: available
17/01/28 22:03:27 DEBUG PlatformDependent0: sun.misc.Unsafe.theUnsafe: available
17/01/28 22:03:27 DEBUG PlatformDependent0: sun.misc.Unsafe.copyMemory: available
17/01/28 22:03:27 DEBUG PlatformDependent0: direct buffer constructor: available
17/01/28 22:03:27 DEBUG PlatformDependent0: java.nio.Bits.unaligned: available, true
17/01/28 22:03:27 DEBUG PlatformDependent0: java.nio.DirectByteBuffer.<init>(long, int): available
17/01/28 22:03:27 DEBUG Cleaner0: java.nio.ByteBuffer.cleaner(): available
17/01/28 22:03:27 DEBUG PlatformDependent: Java version: 8
17/01/28 22:03:27 DEBUG PlatformDependent: -Dio.netty.noUnsafe: false
17/01/28 22:03:27 DEBUG PlatformDependent: sun.misc.Unsafe: available
17/01/28 22:03:27 DEBUG PlatformDependent: -Dio.netty.noJavassist: false
17/01/28 22:03:27 DEBUG PlatformDependent: Javassist: available
17/01/28 22:03:27 DEBUG PlatformDependent: -Dio.netty.tmpdir: /var/folders/pz/vgqg2gns18j_kxsnkzrp6x_m0000gn/T (java.io.tmpdir)
17/01/28 22:03:27 DEBUG PlatformDependent: -Dio.netty.bitMode: 64 (sun.arch.data.model)
17/01/28 22:03:27 DEBUG PlatformDependent: -Dio.netty.noPreferDirect: false
17/01/28 22:03:27 DEBUG PlatformDependent: io.netty.maxDirectMemory: 0 bytes
17/01/28 22:03:27 DEBUG JavassistTypeParameterMatcherGenerator: Generated: io.netty.util.internal.__matchers__.org.apache.spark.network.protocol.MessageMatcher
17/01/28 22:03:27 DEBUG JavassistTypeParameterMatcherGenerator: Generated: io.netty.util.internal.__matchers__.io.netty.buffer.ByteBufMatcher
17/01/28 22:03:27 DEBUG MultithreadEventLoopGroup: -Dio.netty.eventLoopThreads: 8
17/01/28 22:03:27 DEBUG NioEventLoop: -Dio.netty.noKeySetOptimization: false
17/01/28 22:03:27 DEBUG NioEventLoop: -Dio.netty.selectorAutoRebuildThreshold: 512
17/01/28 22:03:27 DEBUG PlatformDependent: org.jctools-core.MpscChunkedArrayQueue: available
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numHeapArenas: 8
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.numDirectArenas: 8
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.pageSize: 8192
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxOrder: 11
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.chunkSize: 16777216
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.tinyCacheSize: 512
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.smallCacheSize: 256
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.normalCacheSize: 64
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.maxCachedBufferCapacity: 32768
17/01/28 22:03:27 DEBUG PooledByteBufAllocator: -Dio.netty.allocator.cacheTrimInterval: 8192
17/01/28 22:03:27 DEBUG ThreadLocalRandom: -Dio.netty.initialSeedUniquifier: 0x3185a000d3a47bd4 (took 1 ms)
17/01/28 22:03:27 DEBUG ByteBufUtil: -Dio.netty.allocator.type: unpooled
17/01/28 22:03:27 DEBUG ByteBufUtil: -Dio.netty.threadLocalDirectBufferSize: 65536
17/01/28 22:03:27 DEBUG ByteBufUtil: -Dio.netty.maxThreadLocalCharBufferSize: 16384
17/01/28 22:03:27 DEBUG NetUtil: Loopback interface: lo0 (lo0, 0:0:0:0:0:0:0:1)
17/01/28 22:03:27 DEBUG NetUtil: /proc/sys/net/core/somaxconn: 128 (non-existent)
17/01/28 22:03:27 DEBUG TransportServer: Shuffle server started on port: 56107
17/01/28 22:03:27 INFO Utils: Successfully started service 'sparkDriver' on port 56107.
17/01/28 22:03:27 DEBUG SparkEnv: Using serializer: class org.apache.spark.serializer.JavaSerializer
17/01/28 22:03:27 INFO SparkEnv: Registering MapOutputTracker
17/01/28 22:03:27 DEBUG MapOutputTrackerMasterEndpoint: init
17/01/28 22:03:27 INFO SparkEnv: Registering BlockManagerMaster
17/01/28 22:03:27 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/01/28 22:03:27 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/01/28 22:03:28 INFO DiskBlockManager: Created local directory at /private/var/folders/pz/vgqg2gns18j_kxsnkzrp6x_m0000gn/T/blockmgr-4079e45b-e4e0-4386-bffe-42af18634710
17/01/28 22:03:28 DEBUG DiskBlockManager: Adding shutdown hook
17/01/28 22:03:28 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
17/01/28 22:03:28 INFO SparkEnv: Registering OutputCommitCoordinator
17/01/28 22:03:28 DEBUG OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: init
17/01/28 22:03:28 DEBUG SecurityManager: Created SSL options for ui: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
17/01/28 22:03:28 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/01/28 22:03:28 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://fe80:0:0:0:2b9a:4521:a301:e9a5%10:4040
17/01/28 22:03:28 INFO Executor: Starting executor ID driver on host localhost
17/01/28 22:03:28 INFO Executor: Using REPL class URI: spark://fe80:0:0:0:2b9a:4521:a301:e9a5%10:56107/classes
17/01/28 22:03:28 ERROR SparkContext: Error initializing SparkContext.
java.lang.AssertionError: assertion failed: Expected hostname
at scala.Predef$.assert(Predef.scala:170)
at org.apache.spark.util.Utils$.checkHost(Utils.scala:931)
at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:31)
at org.apache.spark.executor.Executor.<init>(Executor.scala:121)
at org.apache.spark.scheduler.local.LocalEndpoint.<init>(LocalSchedulerBackend.scala:59)
at org.apache.spark.scheduler.local.LocalSchedulerBackend.start(LocalSchedulerBackend.scala:126)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
at $line3.$read$$iw$$iw.<init>(<console>:15)
at $line3.$read$$iw.<init>(<console>:42)
at $line3.$read.<init>(<console>:44)
at $line3.$read$.<init>(<console>:48)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.$print$lzycompute(<console>:7)
at $line3.$eval$.$print(<console>:6)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:105)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:68)
at org.apache.spark.repl.Main$.main(Main.scala:51)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/28 22:03:28 INFO SparkUI: Stopped Spark web UI at http://fe80:0:0:0:2b9a:4521:a301:e9a5%10:4040
17/01/28 22:03:28 ERROR Utils: Uncaught exception in thread main
java.lang.NullPointerException
at org.apache.spark.scheduler.local.LocalSchedulerBackend.org$apache$spark$scheduler$local$LocalSchedulerBackend$$stop(LocalSchedulerBackend.scala:158)
at org.apache.spark.scheduler.local.LocalSchedulerBackend.stop(LocalSchedulerBackend.scala:137)
at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:467)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1588)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1826)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1283)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1825)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:587)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
at $line3.$read$$iw$$iw.<init>(<console>:15)
at $line3.$read$$iw.<init>(<console>:42)
at $line3.$read.<init>(<console>:44)
at $line3.$read$.<init>(<console>:48)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.$print$lzycompute(<console>:7)
at $line3.$eval$.$print(<console>:6)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:105)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:68)
at org.apache.spark.repl.Main$.main(Main.scala:51)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/01/28 22:03:28 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/01/28 22:03:28 INFO MemoryStore: MemoryStore cleared
17/01/28 22:03:28 INFO BlockManager: BlockManager stopped
17/01/28 22:03:28 INFO BlockManagerMaster: BlockManagerMaster stopped
17/01/28 22:03:28 WARN MetricsSystem: Stopping a MetricsSystem that is not running
17/01/28 22:03:28 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/01/28 22:03:28 INFO SparkContext: Successfully stopped SparkContext
java.lang.AssertionError: assertion failed: Expected hostname
at scala.Predef$.assert(Predef.scala:170)
at org.apache.spark.util.Utils$.checkHost(Utils.scala:931)
at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:31)
at org.apache.spark.executor.Executor.<init>(Executor.scala:121)
at org.apache.spark.scheduler.local.LocalEndpoint.<init>(LocalSchedulerBackend.scala:59)
at org.apache.spark.scheduler.local.LocalSchedulerBackend.start(LocalSchedulerBackend.scala:126)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
... 47 elided
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.1.0
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_40)
Type in expressions to have them evaluated.
Type :help for more information.
OK, I seem to be able to get around it by passing the configuration directly: --conf spark.driver.host=localhost
So I run:
./bin/spark-shell --conf spark.driver.host=localhost
Still, if there is a better solution, please let me know.
[UPDATE]
Jacek Laskowski confirmed this is probably the only available solution for now.
For those who are working with Spark through sbt and hitting the same issue: just add .set("spark.driver.host", "localhost") to your SparkConf(), so the initialisation of the Spark context looks like this:
val conf =
  new SparkConf()
    .setAppName("temp1")
    .setMaster("local")
    .set("spark.driver.host", "localhost")
val sc =
  SparkContext
    .getOrCreate(conf)
This initial configuration must be done before any other call to getOrCreate on the SparkContext.
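If you are building a SparkSession rather than a raw SparkContext, a minimal Scala sketch of the same workaround (assuming Spark 2.x; the app name and master value are only illustrative) would be:
import org.apache.spark.sql.SparkSession

// Same workaround applied through SparkSession.builder (Spark 2.x assumed);
// "temp1" and "local" mirror the example above and are illustrative only.
val spark = SparkSession.builder()
  .appName("temp1")
  .master("local")
  .config("spark.driver.host", "localhost")
  .getOrCreate()

val sc = spark.sparkContext  // the underlying SparkContext picks up the setting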
The first thing to check is probably /etc/hosts. Make sure that you have the following entry:
127.0.0.1 localhost
If the above does not work, then the following should do the trick:
sudo hostname -s 127.0.0.1
I faced the same issue while using SharedSparkContext with my tests.
Adding the following to my beforeAll method, as @dennis suggested, solved the problem for me:
override def beforeAll(): Unit = {
  super.beforeAll()
  sc.getConf.setMaster("local").set("spark.driver.host", "localhost")
}
I hope this will be fixed in a future version of Spark.
If you are using pyspark, use the config method to set the driver host to localhost.
from pyspark.sql import SparkSession

spark = (SparkSession
         .builder
         .appName("temp1")
         .config("spark.driver.host", "localhost")
         .getOrCreate()
         )

Could you give me any clue why I get 'Cannot call methods on a stopped SparkContext'?

When I run 'val lines = sc.textFile("hdfs:///input")' in yarn-client mode, the error 'Cannot call methods on a stopped SparkContext' occurs. I have searched for two days, but I cannot find the cause. The path "hdfs:///input" is correct, because the same command works fine in standalone mode.
Could you give me any idea what might be wrong?
I'm using Spark 1.5.2 and Hadoop 2.7.2.
starting org.apache.spark.deploy.master.Master, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
192.168.111.203: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out
192.168.111.202: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
[root#master spark-1.5.2-bin-hadoop2.6]# bin/spark-shell --master yarn-client
16/03/19 05:59:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 05:59:12 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:12 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:13 INFO spark.HttpServer: Starting HTTP Server
16/03/19 05:59:13 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:13 INFO server.AbstractConnector: Started SocketConnector#0.0.0.0:46780
16/03/19 05:59:13 INFO util.Utils: Successfully started service 'HTTP class server' on port 46780.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.5.2
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_73)
Type in expressions to have them evaluated.
Type :help for more information.
16/03/19 05:59:17 INFO spark.SparkContext: Running Spark version 1.5.2
16/03/19 05:59:17 WARN spark.SparkConf:
SPARK_JAVA_OPTS was detected (set to '-Dspark.driver.port=53411').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with conf/spark-defaults.conf to set defaults for an application
- ./spark-submit with --driver-java-options to set -X options for a driver
- spark.executor.extraJavaOptions to set -X options for executors
- SPARK_DAEMON_JAVA_OPTS to set java options for standalone daemons (master or worker)
16/03/19 05:59:17 WARN spark.SparkConf: Setting 'spark.executor.extraJavaOptions' to '-Dspark.driver.port=53411' as a work-around.
16/03/19 05:59:17 WARN spark.SparkConf: Setting 'spark.driver.extraJavaOptions' to '-Dspark.driver.port=53411' as a work-around.
16/03/19 05:59:17 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:17 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:17 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:18 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/03/19 05:59:18 INFO Remoting: Starting remoting
16/03/19 05:59:18 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver#192.168.111.201:53411]
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'sparkDriver' on port 53411.
16/03/19 05:59:18 INFO spark.SparkEnv: Registering MapOutputTracker
16/03/19 05:59:18 INFO spark.SparkEnv: Registering BlockManagerMaster
16/03/19 05:59:18 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-f70b1bb6-288b-4894-bb49-22d1fc3d8d89
16/03/19 05:59:18 INFO storage.MemoryStore: MemoryStore started with capacity 534.5 MB
16/03/19 05:59:18 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-58591b6b-5b19-4bc0-a993-0b846de5ef6f/httpd-fe0c46a2-1d87-4bc7-8b4f-adfc79cb762a
16/03/19 05:59:18 INFO spark.HttpServer: Starting HTTP Server
16/03/19 05:59:18 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:18 INFO server.AbstractConnector: Started SocketConnector#0.0.0.0:40258
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'HTTP file server' on port 40258.
16/03/19 05:59:18 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/03/19 05:59:18 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:18 INFO server.AbstractConnector: Started SelectChannelConnector#0.0.0.0:4040
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/03/19 05:59:18 INFO ui.SparkUI: Started SparkUI at http://192.168.111.201:4040
16/03/19 05:59:19 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/03/19 05:59:19 INFO client.RMProxy: Connecting to ResourceManager at /192.168.111.201:8032
16/03/19 05:59:19 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
16/03/19 05:59:19 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/03/19 05:59:19 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/03/19 05:59:19 INFO yarn.Client: Setting up container launch context for our AM
16/03/19 05:59:19 INFO yarn.Client: Setting up the launch environment for our AM container
16/03/19 05:59:19 INFO yarn.Client: Preparing resources for our AM container
16/03/19 05:59:21 INFO yarn.Client: Uploading resource file:/opt/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar -> hdfs://192.168.111.201:9000/user/root/.sparkStaging/application_1458334003417_0002/spark-assembly-1.5.2-hadoop2.6.0.jar
16/03/19 05:59:25 INFO yarn.Client: Uploading resource file:/tmp/spark-58591b6b-5b19-4bc0-a993-0b846de5ef6f/__spark_conf__2052137095112870542.zip -> hdfs://192.168.111.201:9000/user/root/.sparkStaging/application_1458334003417_0002/__spark_conf__2052137095112870542.zip
16/03/19 05:59:25 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:25 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:25 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:25 INFO yarn.Client: Submitting application 2 to ResourceManager
16/03/19 05:59:25 INFO impl.YarnClientImpl: Submitted application application_1458334003417_0002
16/03/19 05:59:26 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:26 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1458334765746
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1458334003417_0002/
user: root
16/03/19 05:59:27 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:28 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:29 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:30 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:31 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:32 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:33 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:34 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:35 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM#192.168.111.203:46505/user/YarnAM#149895142])
16/03/19 05:59:35 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1458334003417_0002), /proxy/application_1458334003417_0002
16/03/19 05:59:35 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/19 05:59:35 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:46505
16/03/19 05:59:35 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM#192.168.111.203:46505] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:35 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:46505
16/03/19 05:59:35 INFO yarn.Client: Application report for application_1458334003417_0002 (state: RUNNING)
16/03/19 05:59:35 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.111.203
ApplicationMaster RPC port: 0
queue: default
start time: 1458334765746
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1458334003417_0002/
user: root
16/03/19 05:59:35 INFO cluster.YarnClientSchedulerBackend: Application application_1458334003417_0002 has started running.
16/03/19 05:59:36 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42938.
16/03/19 05:59:36 INFO netty.NettyBlockTransferService: Server created on 42938
16/03/19 05:59:36 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/03/19 05:59:36 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.111.201:42938 with 534.5 MB RAM, BlockManagerId(driver, 192.168.111.201, 42938)
16/03/19 05:59:36 INFO storage.BlockManagerMaster: Registered BlockManager
16/03/19 05:59:40 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM#192.168.111.203:34633/user/YarnAM#-40449267])
16/03/19 05:59:40 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1458334003417_0002), /proxy/application_1458334003417_0002
16/03/19 05:59:40 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM#192.168.111.203:34633] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/03/19 05:59:41 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.111.201:4040
16/03/19 05:59:41 INFO scheduler.DAGScheduler: Stopping DAGScheduler
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Stopped
16/03/19 05:59:42 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/19 05:59:42 INFO storage.MemoryStore: MemoryStore cleared
16/03/19 05:59:42 INFO storage.BlockManager: BlockManager stopped
16/03/19 05:59:42 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/03/19 05:59:42 INFO spark.SparkContext: Successfully stopped SparkContext
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/03/19 05:59:49 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/03/19 05:59:49 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.
16/03/19 05:59:49 INFO hive.HiveContext: Initializing execution hive, version 1.2.1
16/03/19 05:59:49 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0
16/03/19 05:59:49 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/03/19 05:59:50 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/03/19 05:59:50 INFO metastore.ObjectStore: ObjectStore, initialize called
16/03/19 05:59:50 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/03/19 05:59:50 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/03/19 05:59:50 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 05:59:51 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 05:59:53 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/03/19 05:59:54 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:54 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/03/19 05:59:56 INFO metastore.ObjectStore: Initialized ObjectStore
16/03/19 05:59:57 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/03/19 05:59:57 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/03/19 05:59:57 INFO metastore.HiveMetaStore: Added admin role in metastore
16/03/19 05:59:57 INFO metastore.HiveMetaStore: Added public role in metastore
16/03/19 05:59:58 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/03/19 05:59:58 INFO metastore.HiveMetaStore: 0: get_all_databases
16/03/19 05:59:58 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/03/19 05:59:58 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/19 05:59:58 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/19 05:59:58 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/root
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/e16dc45f-de41-4e69-9f73-c976cc3358c9_resources
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/e16dc45f-de41-4e69-9f73-c976cc3358c9
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/root/e16dc45f-de41-4e69-9f73-c976cc3358c9
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/e16dc45f-de41-4e69-9f73-c976cc3358c9/_tmp_space.db
16/03/19 05:59:58 INFO hive.HiveContext: default warehouse location is /user/hive/warehouse
16/03/19 05:59:58 INFO hive.HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/03/19 05:59:58 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0
16/03/19 05:59:59 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/03/19 06:00:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 06:00:00 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/03/19 06:00:00 INFO metastore.ObjectStore: ObjectStore, initialize called
16/03/19 06:00:00 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/03/19 06:00:00 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/03/19 06:00:00 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 06:00:00 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 06:00:01 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/03/19 06:00:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/03/19 06:00:04 INFO metastore.ObjectStore: Initialized ObjectStore
16/03/19 06:00:04 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/03/19 06:00:05 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/03/19 06:00:05 INFO metastore.HiveMetaStore: Added admin role in metastore
16/03/19 06:00:05 INFO metastore.HiveMetaStore: Added public role in metastore
16/03/19 06:00:05 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/03/19 06:00:05 INFO metastore.HiveMetaStore: 0: get_all_databases
16/03/19 06:00:05 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/03/19 06:00:06 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/19 06:00:06 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/19 06:00:06 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:06 INFO session.SessionState: Created local directory: /tmp/b046e212-ccbd-4415-aec3-5b207f147fda_resources
16/03/19 06:00:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/b046e212-ccbd-4415-aec3-5b207f147fda
16/03/19 06:00:06 INFO session.SessionState: Created local directory: /tmp/root/b046e212-ccbd-4415-aec3-5b207f147fda
16/03/19 06:00:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/b046e212-ccbd-4415-aec3-5b207f147fda/_tmp_space.db
16/03/19 06:00:06 INFO repl.SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.
scala> val lines = sc.textFile("hdfs:///input")
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)
at org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:2063)
at org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:2076)
at org.apache.spark.SparkContext.textFile$default$2(SparkContext.scala:825)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:21)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:28)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
at $iwC$$iwC$$iwC.<init>(<console>:34)
at $iwC$$iwC.<init>(<console>:36)
at $iwC.<init>(<console>:38)
at <init>(<console>:40)
at .<init>(<console>:44)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I encountered this in my Spark Structured Streaming application when I forgot to include the following:
spark.streams.awaitAnyTermination()
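For context, here is a minimal Structured Streaming skeleton (a sketch only; the socket source, port and names are assumptions, not taken from the original post) showing where that call belongs:
import org.apache.spark.sql.SparkSession

// Without the awaitAnyTermination() call at the end, main() returns, the
// SparkContext gets stopped, and subsequent streaming work fails with
// "Cannot call methods on a stopped SparkContext".
object StreamingSkeleton {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("streaming-skeleton")   // illustrative name
      .getOrCreate()

    // Assumed example source: lines read from a local socket.
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()

    // Start a query that echoes the stream to the console.
    lines.writeStream
      .format("console")
      .start()

    // Block until a streaming query terminates; this is the call referred to above.
    spark.streams.awaitAnyTermination()
  }
}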
Your YARN application exits immediately after it starts:
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM#192.168.111.203:34633] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
Then the SparkContext is closed, so any action on it will throw the exception you see.
Check the Application Master logs (visible through YARN's UI, or via yarn logs -applicationId <application id>) to see the cause of the failure. It could be a memory configuration issue, a network issue (e.g. host unreachable), or something else; the log on the driver side (which is what you pasted) won't tell you which one it is.

Running Spark on the slave node (YARN) doesn't work

I can run the SparkPi example on the master node, but when I try the same command
"spark-submit --class SparkPi --master yarn-client sparkpi.jar 10"
on a slave node, I get this error:
2015-05-19 14:05:44,881 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing view acls to: maintainer
2015-05-19 14:05:44,886 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing modify acls to: maintainer
2015-05-19 14:05:44,887 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(maintainer); users with modify permissions: Set(maintainer)
2015-05-19 14:05:45,389 INFO [sparkDriver-akka.actor.default-dispatcher-4] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2015-05-19 14:05:45,443 INFO [sparkDriver-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2015-05-19 14:05:45,641 INFO [sparkDriver-akka.actor.default-dispatcher-3] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriver#slave2.com:33055]
2015-05-19 14:05:45,644 INFO [sparkDriver-akka.actor.default-dispatcher-3] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting now listens on addresses: [akka.tcp://sparkDriver#slave2.com:33055]
2015-05-19 14:05:45,653 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 33055.
2015-05-19 14:05:45,674 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering MapOutputTracker
2015-05-19 14:05:45,688 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering BlockManagerMaster
2015-05-19 14:05:45,707 INFO [main] storage.DiskBlockManager (Logging.scala:logInfo(59)) - Created local directory at /tmp/spark-local-20150519140545-c81b
2015-05-19 14:05:45,712 INFO [main] storage.MemoryStore (Logging.scala:logInfo(59)) - MemoryStore started with capacity 265.4 MB
2015-05-19 14:05:46,205 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-05-19 14:05:46,408 INFO [main] spark.HttpFileServer (Logging.scala:logInfo(59)) - HTTP File server directory is /tmp/spark-e95a2b5b-efea-41eb-93b9-0a9f7d6f6701
2015-05-19 14:05:46,413 INFO [main] spark.HttpServer (Logging.scala:logInfo(59)) - Starting HTTP Server
2015-05-19 14:05:46,477 INFO [main] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-05-19 14:05:46,499 INFO [main] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SocketConnector#0.0.0.0:52737
2015-05-19 14:05:46,500 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'HTTP file server' on port 52737.
2015-05-19 14:05:46,790 INFO [main] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-05-19 14:05:46,805 INFO [main] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector#0.0.0.0:4040
2015-05-19 14:05:46,805 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'SparkUI' on port 4040.
2015-05-19 14:05:46,808 INFO [main] ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at http://slave2.com:4040
2015-05-19 14:05:47,058 INFO [main] spark.SparkContext (Logging.scala:logInfo(59)) - Added JAR file:/home/maintainer/myjars/sparkpi.jar at http://[ip]:52737/jars/sparkpi.jar with timestamp 1432033547057
2015-05-19 14:05:47,190 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
2015-05-19 14:09:45,861 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
**2015-05-19 14:09:47,067 INFO [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-05-19 14:09:48,068 INFO [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
...**
Aside from specifying the yarn.resourcemanager.hostname property in yarn-site.xml, it is also necessary to propagate the configuration files to the workers.
That can be done with this line (before running spark-submit):
export SPARK_YARN_DIST_FILES=$(ls $HADOOP_CONF_DIR* | sed 's#^#file://#g' | tr '\n' ',' | sed 's/,$//')
If everything is configured correctly, you'll see the ResourceManager hostname instead of 0.0.0.0 in this line:
2015-05-19 14:05:47,190 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
Exporting correct values for HADOOP_CONF_DIR fixed the issue.
export HADOOP_CONF_DIR=/your-path/hadoop/conf

Got exception "unread block data" when reading an HBase table into a Spark (1.2.0.2.2.0.0-82) RDD using PySpark on YARN client on the HDP (2.2) platform

I get a strange exception when reading an HBase (0.98.4.2.2.0.0) table into a Spark (1.2.0.2.2.0.0-82) RDD using PySpark on YARN client (2.6.0) on the HDP (2.2) platform:
2015-04-14 19:05:11,295 WARN [task-result-getter-0] scheduler.TaskSetManager (Logging.scala:logWarning(71)) - Lost task 0.0 in stage 0.0 (TID 0, hadoop-node05.mathartsys.com): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I followed the Spark example Python code (https://github.com/apache/spark/blob/master/examples/src/main/python/hbase_inputformat.py), and my code is:
import sys
from pyspark import SparkContext

if __name__ == "__main__":
    sc = SparkContext(appName="HBaseInputFormat")
    conf = {"hbase.zookeeper.quorum": "hadoop-node01.mathartsys.com,hadoop-node02.mathartsys.com,hadoop-node03.mathartsys.com",
            "hbase.mapreduce.inputtable": "test",
            "hbase.cluster.distributed": "true",
            "hbase.rootdir": "hdfs://hadoop-node01.mathartsys.com:8020/apps/hbase/data",
            "hbase.zookeeper.property.clientPort": "2181",
            "zookeeper.session.timeout": "30000",
            "zookeeper.znode.parent": "/hbase-unsecure"}
    keyConv = "org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter"
    valueConv = "org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter"
    hbase_rdd = sc.newAPIHadoopRDD(
        "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
        "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        "org.apache.hadoop.hbase.client.Result",
        keyConverter=keyConv,
        valueConverter=valueConv,
        conf=conf)
    output = hbase_rdd.collect()
    for (k, v) in output:
        print(k, v)
    sc.stop()
and submitted the job like this:
spark-submit --master yarn-client --driver-class-path /opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/*:/usr/hdp/current/hbase-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/* hbase_inputformat.py
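Note that --driver-class-path only adds these jars to the driver JVM; the executors launched on the YARN nodes do not inherit it. As a hedged sketch (reusing the same HDP paths purely for illustration, not as a confirmed fix), the same jar directories could also be exposed to the executors via spark.executor.extraClassPath:
spark-submit --master yarn-client \
  --driver-class-path "/usr/hdp/current/hbase-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/*" \
  --conf spark.executor.extraClassPath="/usr/hdp/current/hbase-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/*" \
  hbase_test2.py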
My environment is:
Centos 6.5
HDP 2.2
Spark 1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041
Can you give some suggestions to solve it?
The full log is:
[root#hadoop-node03 hbase]# spark-submit --master yarn-client --driver-class-path /opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/*:/usr/hdp/current/hbase-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/* hbase_test2.py
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-examples-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2015-04-14 22:41:34,839 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing view acls to: root
2015-04-14 22:41:34,846 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing modify acls to: root
2015-04-14 22:41:34,847 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
2015-04-14 22:41:35,459 INFO [sparkDriver-akka.actor.default-dispatcher-4] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2015-04-14 22:41:35,524 INFO [sparkDriver-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2015-04-14 22:41:35,754 INFO [sparkDriver-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriver#hadoop-node03.mathartsys.com:44295]
2015-04-14 22:41:35,764 INFO [Thread-2] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 44295.
2015-04-14 22:41:35,790 INFO [Thread-2] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering MapOutputTracker
2015-04-14 22:41:35,806 INFO [Thread-2] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering BlockManagerMaster
2015-04-14 22:41:35,826 INFO [Thread-2] storage.DiskBlockManager (Logging.scala:logInfo(59)) - Created local directory at /tmp/spark-local-20150414224135-a290
2015-04-14 22:41:35,832 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - MemoryStore started with capacity 265.4 MB
2015-04-14 22:41:36,535 WARN [Thread-2] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-04-14 22:41:36,823 INFO [Thread-2] spark.HttpFileServer (Logging.scala:logInfo(59)) - HTTP File server directory is /tmp/spark-b963d482-e9be-476b-85b0-94ab6cd8076c
2015-04-14 22:41:36,830 INFO [Thread-2] spark.HttpServer (Logging.scala:logInfo(59)) - Starting HTTP Server
2015-04-14 22:41:36,902 INFO [Thread-2] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-04-14 22:41:36,921 INFO [Thread-2] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SocketConnector#0.0.0.0:58608
2015-04-14 22:41:36,925 INFO [Thread-2] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'HTTP file server' on port 58608.
2015-04-14 22:41:37,054 INFO [Thread-2] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-04-14 22:41:37,069 INFO [Thread-2] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector#0.0.0.0:4040
2015-04-14 22:41:37,070 INFO [Thread-2] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'SparkUI' on port 4040.
2015-04-14 22:41:37,073 INFO [Thread-2] ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at http://hadoop-node03.mathartsys.com:4040
2015-04-14 22:41:38,034 INFO [Thread-2] impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(285)) - Timeline service address: http://hadoop-node02.mathartsys.com:8188/ws/v1/timeline/
2015-04-14 22:41:38,220 INFO [Thread-2] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at hadoop-node02.mathartsys.com/10.0.0.222:8050
2015-04-14 22:41:38,511 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Requesting a new application from cluster with 3 NodeManagers
2015-04-14 22:41:38,536 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Verifying our application has not requested more than the maximum memory capability of the cluster (15360 MB per container)
2015-04-14 22:41:38,537 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Will allocate AM container, with 896 MB memory including 384 MB overhead
2015-04-14 22:41:38,537 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Setting up container launch context for our AM
2015-04-14 22:41:38,544 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Preparing resources for our AM container
2015-04-14 22:41:39,125 WARN [Thread-2] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(116)) - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2015-04-14 22:41:39,207 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Uploading resource file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar -> hdfs://hadoop-node01.mathartsys.com:8020/user/root/.sparkStaging/application_1428915066363_0013/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar
2015-04-14 22:41:40,428 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Uploading resource file:/root/hbase/hbase_test2.py -> hdfs://hadoop-node01.mathartsys.com:8020/user/root/.sparkStaging/application_1428915066363_0013/hbase_test2.py
2015-04-14 22:41:40,511 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Setting up the launch environment for our AM container
2015-04-14 22:41:40,564 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing view acls to: root
2015-04-14 22:41:40,564 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing modify acls to: root
2015-04-14 22:41:40,565 INFO [Thread-2] spark.SecurityManager (Logging.scala:logInfo(59)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
2015-04-14 22:41:40,568 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Submitting application 13 to ResourceManager
2015-04-14 22:41:40,609 INFO [Thread-2] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(251)) - Submitted application application_1428915066363_0013
2015-04-14 22:41:41,615 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:41,621 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) -
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1429022500586
final status: UNDEFINED
tracking URL: http://hadoop-node02.mathartsys.com:8088/proxy/application_1428915066363_0013/
user: root
2015-04-14 22:41:42,624 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:43,627 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:44,631 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:45,635 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: ACCEPTED)
2015-04-14 22:41:46,278 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM#hadoop-node05.mathartsys.com:42992/user/YarnAM#708767775]
2015-04-14 22:41:46,284 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoop-node02.mathartsys.com, PROXY_URI_BASES -> http://hadoop-node02.mathartsys.com:8088/proxy/application_1428915066363_0013), /proxy/application_1428915066363_0013
2015-04-14 22:41:46,287 INFO [sparkDriver-akka.actor.default-dispatcher-4] ui.JettyUtils (Logging.scala:logInfo(59)) - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2015-04-14 22:41:46,638 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) - Application report for application_1428915066363_0013 (state: RUNNING)
2015-04-14 22:41:46,639 INFO [Thread-2] yarn.Client (Logging.scala:logInfo(59)) -
client token: N/A
diagnostics: N/A
ApplicationMaster host: hadoop-node05.mathartsys.com
ApplicationMaster RPC port: 0
queue: default
start time: 1429022500586
final status: UNDEFINED
tracking URL: http://hadoop-node02.mathartsys.com:8088/proxy/application_1428915066363_0013/
user: root
2015-04-14 22:41:46,641 INFO [Thread-2] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - Application application_1428915066363_0013 has started running.
2015-04-14 22:41:46,795 INFO [Thread-2] netty.NettyBlockTransferService (Logging.scala:logInfo(59)) - Server created on 56053
2015-04-14 22:41:46,797 INFO [Thread-2] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Trying to register BlockManager
2015-04-14 22:41:46,800 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Registering block manager hadoop-node03.mathartsys.com:56053 with 265.4 MB RAM, BlockManagerId(<driver>, hadoop-node03.mathartsys.com, 56053)
2015-04-14 22:41:46,803 INFO [Thread-2] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Registered BlockManager
2015-04-14 22:41:55,529 INFO [sparkDriver-akka.actor.default-dispatcher-3] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - Registered executor: Actor[akka.tcp://sparkExecutor#hadoop-node06.mathartsys.com:42500/user/Executor#-374031537] with ID 2
2015-04-14 22:41:55,560 INFO [sparkDriver-akka.actor.default-dispatcher-3] util.RackResolver (RackResolver.java:coreResolve(109)) - Resolved hadoop-node06.mathartsys.com to /default-rack
2015-04-14 22:41:55,653 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - Registered executor: Actor[akka.tcp://sparkExecutor#hadoop-node04.mathartsys.com:54112/user/Executor#35135131] with ID 1
2015-04-14 22:41:55,655 INFO [sparkDriver-akka.actor.default-dispatcher-4] util.RackResolver (RackResolver.java:coreResolve(109)) - Resolved hadoop-node04.mathartsys.com to /default-rack
2015-04-14 22:41:55,690 INFO [Thread-2] cluster.YarnClientSchedulerBackend (Logging.scala:logInfo(59)) - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
2015-04-14 22:41:55,998 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(298340) called with curMem=0, maxMem=278302556
2015-04-14 22:41:56,001 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_0 stored as values in memory (estimated size 291.3 KB, free 265.1 MB)
2015-04-14 22:41:56,160 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(44100) called with curMem=298340, maxMem=278302556
2015-04-14 22:41:56,161 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_0_piece0 stored as bytes in memory (estimated size 43.1 KB, free 265.1 MB)
2015-04-14 22:41:56,163 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerInfo (Logging.scala:logInfo(59)) - Added broadcast_0_piece0 in memory on hadoop-node03.mathartsys.com:56053 (size: 43.1 KB, free: 265.4 MB)
2015-04-14 22:41:56,164 INFO [Thread-2] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Updated info of block broadcast_0_piece0
2015-04-14 22:41:56,167 INFO [Thread-2] spark.DefaultExecutionContext (Logging.scala:logInfo(59)) - Created broadcast 0 from newAPIHadoopRDD at PythonRDD.scala:516
2015-04-14 22:41:56,204 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(298388) called with curMem=342440, maxMem=278302556
2015-04-14 22:41:56,205 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_1 stored as values in memory (estimated size 291.4 KB, free 264.8 MB)
2015-04-14 22:41:56,279 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(44100) called with curMem=640828, maxMem=278302556
2015-04-14 22:41:56,279 INFO [Thread-2] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_1_piece0 stored as bytes in memory (estimated size 43.1 KB, free 264.8 MB)
2015-04-14 22:41:56,281 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerInfo (Logging.scala:logInfo(59)) - Added broadcast_1_piece0 in memory on hadoop-node03.mathartsys.com:56053 (size: 43.1 KB, free: 265.3 MB)
2015-04-14 22:41:56,281 INFO [Thread-2] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Updated info of block broadcast_1_piece0
2015-04-14 22:41:56,283 INFO [Thread-2] spark.DefaultExecutionContext (Logging.scala:logInfo(59)) - Created broadcast 1 from broadcast at PythonRDD.scala:497
2015-04-14 22:41:56,286 INFO [Thread-2] python.Converter (Logging.scala:logInfo(59)) - Loaded converter: org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter
2015-04-14 22:41:56,287 INFO [Thread-2] python.Converter (Logging.scala:logInfo(59)) - Loaded converter: org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter
2015-04-14 22:41:56,400 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Registering block manager hadoop-node06.mathartsys.com:39033 with 530.3 MB RAM, BlockManagerId(2, hadoop-node06.mathartsys.com, 39033)
2015-04-14 22:41:56,434 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Registering block manager hadoop-node04.mathartsys.com:33968 with 530.3 MB RAM, BlockManagerId(1, hadoop-node04.mathartsys.com, 33968)
......
......
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:host.name=hadoop-node03.mathartsys.com
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.version=1.7.0_75
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.vendor=Oracle Corporation
2015-04-14 22:41:56,438 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.home=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.class.path=:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-1.2.0.2.2.0.0-82-yarn-shuffle.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-examples-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/datanucleus-core-3.2.10.jar:/usr/hdp/current/hbase-client/lib/curator-framework-2.6.0.jar:/usr/hdp/current/hbase-client/lib/commons-math-2.1.jar:/usr/hdp/current/hbase-client/lib/zookeeper.jar:/usr/hdp/current/hbase-client/lib/commons-lang-2.6.jar:/usr/hdp/current/hbase-client/lib/commons-io-2.4.jar:/usr/hdp/current/hbase-client/lib/jersey-server-1.8.jar:/usr/hdp/current/hbase-client/lib/servlet-api-2.5.jar:/usr/hdp/current/hbase-client/lib/gson-2.2.4.jar:/usr/hdp/current/hbase-client/lib/jackson-mapper-asl-1.9.13.jar:/usr/hdp/current/hbase-client/lib/hbase-shell.jar:/usr/hdp/current/hbase-client/lib/api-asn1-api-1.0.0-M20.jar:/usr/hdp/current/hbase-client/lib/jasper-runtime-5.5.23.jar:/usr/hdp/current/hbase-client/lib/xercesImpl-2.9.1.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jsch-0.1.42.jar:/usr/hdp/current/hbase-client/lib/xml-apis-1.3.04.jar:/usr/hdp/current/hbase-client/lib/jetty-6.1.26.jar:/usr/hdp/current/hbase-client/lib/commons-httpclient-3.1.jar:/usr/hdp/current/hbase-client/lib/aopalliance-1.0.jar:/usr/hdp/current/hbase-client/lib/hbase-testing-util-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/hbase-it.jar:/usr/hdp/current/hbase-client/lib/hbase-hadoop-compat-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/commons-digester-1.8.jar:/usr/hdp/current/hbase-client/lib/servlet-api-2.5-6.1.14.jar:/usr/hdp/current/hbase-client/lib/hbase-server-0.98.4.2.2.0.0-2041-hadoop2-tests.jar:/usr/hdp/current/hbase-client/lib/hamcrest-core-1.3.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar:/usr/hdp/current/hbase-client/lib/slf4j-api-1.6.4.jar:/usr/hdp/current/hbase-client/lib/jersey-guice-1.9.jar:/usr/hdp/current/hbase-client/lib/commons-configuration-1.6.jar:/usr/hdp/current/hbase-client/lib/jetty-sslengine-6.1.26.jar:/usr/hdp/current/hbase-client/lib/commons-codec-1.7.jar:/usr/hdp/current/hbase-client/lib/ranger-plugins-common-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hbase-client/lib/commons-el-1.0.jar:/usr/hdp/current/hbase-client/lib/hbase-hadoop2-compat.jar:/usr/hdp/current/hbase-client/lib/eclipselink-2.5.2-M1.jar:/usr/hdp/current/hbase-client/lib/jamon-runtime-2.3.1.jar:/usr/hdp/current/hbase-client/lib/xmlenc-0.52.jar:/usr/hdp/current/hbase-client/lib/hbase-prefix-tree-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/curator-recipes-2.6.0.jar:/usr/hdp/current/hbase-client/lib/jersey-core-1.8.jar:/usr/hdp/current/hbase-client/lib/hbase-testing-util.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/hdp/current/hbase-client/lib/hbase-shell-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/commons-beanutils-1.7.0.jar:/usr/hdp/current/hbase-client/lib/hbase-h
adoop-compat.jar:/usr/hdp/current/hbase-client/lib/leveldbjni-all-1.8.jar:/usr/hdp/current/hbase-client/lib/jasper-compiler-5.5.23.jar:/usr/hdp/current/hbase-client/lib/ojdbc6.jar:/usr/hdp/current/hbase-client/lib/commons-daemon-1.0.13.jar:/usr/hdp/current/hbase-client/lib/api-util-1.0.0-M20.jar:/usr/hdp/current/hbase-client/lib/protobuf-java-2.5.0.jar:/usr/hdp/current/hbase-client/lib/httpclient-4.2.5.jar:/usr/hdp/current/hbase-client/lib/htrace-core-2.04.jar:/usr/hdp/current/hbase-client/lib/jersey-client-1.9.jar:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/guice-servlet-3.0.jar:/usr/hdp/current/hbase-client/lib/metrics-core-2.2.0.jar:/usr/hdp/current/hbase-client/lib/htrace-core-3.0.4.jar:/usr/hdp/current/hbase-client/lib/paranamer-2.3.jar:/usr/hdp/current/hbase-client/lib/jackson-core-2.2.3.jar:/usr/hdp/current/hbase-client/lib/commons-compress-1.4.1.jar:/usr/hdp/current/hbase-client/lib/jets3t-0.9.0.jar:/usr/hdp/current/hbase-client/lib/microsoft-windowsazure-storage-sdk-0.6.0.jar:/usr/hdp/current/hbase-client/lib/hbase-examples-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jettison-1.3.1.jar:/usr/hdp/current/hbase-client/lib/commons-math3-3.1.1.jar:/usr/hdp/current/hbase-client/lib/jaxb-api-2.2.2.jar:/usr/hdp/current/hbase-client/lib/javax.inject-1.jar:/usr/hdp/current/hbase-client/lib/findbugs-annotations-1.3.9-1.jar:/usr/hdp/current/hbase-client/lib/mysql-connector-java.jar:/usr/hdp/current/hbase-client/lib/hbase-server-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/hbase-common-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jaxb-impl-2.2.3-1.jar:/usr/hdp/current/hbase-client/lib/jackson-xc-1.9.13.jar:/usr/hdp/current/hbase-client/lib/curator-client-2.6.0.jar:/usr/hdp/current/hbase-client/lib/asm-3.1.jar:/usr/hdp/current/hbase-client/lib/jackson-jaxrs-1.9.13.jar:/usr/hdp/current/hbase-client/lib/hbase-thrift-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jackson-core-asl-1.9.13.jar:/usr/hdp/current/hbase-client/lib/commons-cli-1.2.jar:/usr/hdp/current/hbase-client/lib/ranger-plugins-cred-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hbase-client/lib/java-xmlbuilder-0.4.jar:/usr/hdp/current/hbase-client/lib/jsp-2.1-6.1.14.jar:/usr/hdp/current/hbase-client/lib/hbase-prefix-tree.jar:/usr/hdp/current/hbase-client/lib/commons-beanutils-core-1.8.0.jar:/usr/hdp/current/hbase-client/lib/hbase-hadoop2-compat-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/hbase-it-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/libthrift-0.9.0.jar:/usr/hdp/current/hbase-client/lib/commons-collections-3.2.1.jar:/usr/hdp/current/hbase-client/lib/jruby-complete-1.6.8.jar:/usr/hdp/current/hbase-client/lib/jetty-util-6.1.26.jar:/usr/hdp/current/hbase-client/lib/apacheds-i18n-2.0.0-M15.jar:/usr/hdp/current/hbase-client/lib/ranger-plugins-impl-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hbase-client/lib/log4j-1.2.17.jar:/usr/hdp/current/hbase-client/lib/jersey-json-1.8.jar:/usr/hdp/current/hbase-client/lib/hbase-examples.jar:/usr/hdp/current/hbase-client/lib/hbase-it-0.98.4.2.2.0.0-2041-hadoop2-tests.jar:/usr/hdp/current/hbase-client/lib/xz-1.0.jar:/usr/hdp/current/hbase-client/lib/jsr305-1.3.9.jar:/usr/hdp/current/hbase-client/lib/hbase-thrift.jar:/usr/hdp/current/hbase-client/lib/guice-3.0.jar:/usr/hdp/current/hbase-client/lib/netty-3.6.6.Final.jar:/usr/hdp/current/hbase-client/lib/hbase-common-0.98.4.2.2.0.0-2041-hadoop2-tests.jar:/usr/hdp/current/hbase-client/lib/high-sc
ale-lib-1.1.1.jar:/usr/hdp/current/hbase-client/lib/avro-1.7.4.jar:/usr/hdp/current/hbase-client/lib/httpcore-4.1.3.jar:/usr/hdp/current/hbase-client/lib/commons-logging-1.1.1.jar:/usr/hdp/current/hbase-client/lib/hbase-client-0.98.4.2.2.0.0-2041-hadoop2.jar:/usr/hdp/current/hbase-client/lib/jsp-api-2.1-6.1.14.jar:/usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/junit-4.11.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/ranger-hbase-plugin-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hbase-client/lib/commons-net-3.1.jar:/usr/hdp/current/hbase-client/lib/snappy-java-1.0.4.1.jar:/usr/hdp/current/hbase-client/lib/activation-1.1.jar:/usr/hdp/current/hbase-client/lib/ranger-plugins-audit-0.4.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.jar:/usr/hdp/current/hadoop-mapreduce-client/curator-framework-2.6.0.jar:/usr/hdp/current/hadoop-mapreduce-client/metrics-core-3.0.1.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-lang-2.6.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-io-2.4.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-common-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/servlet-api-2.5.jar:/usr/hdp/current/hadoop-mapreduce-client/gson-2.2.4.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-sls.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-distcp-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-mapper-asl-1.9.13.jar:/usr/hdp/current/hadoop-mapreduce-client/api-asn1-api-1.0.0-M20.jar:/usr/hdp/current/hadoop-mapreduce-client/jasper-runtime-5.5.23.jar:/usr/hdp/current/hadoop-mapreduce-client/jsch-0.1.42.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-auth-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/asm-3.2.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-httpclient-3.1.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-openstack.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-databind-2.2.3.jar:/usr/hdp/current/hadoop-mapreduce-client/jersey-core-1.9.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-ant-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/mockito-all-1.8.5.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-2.6.0.2.2.0.0-2041-tests.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-digester-1.8.jar:/usr/hdp/current/hadoop-mapreduce-client/joda-time-2.5.jar:/usr/hdp/current/hadoop-mapreduce-client/hamcrest-core-1.3.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-datajoin.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-ant.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-configuration-1.6.jar:/usr/hdp/current/hadoop-mapreduce-client/jersey-json-1.9.jar:/usr/hdp/current/hadoop-mapreduce-client/jetty-6.1.26.hwx.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-auth.jar:/usr/hdp/current/hadoop-mapreduce-client/aws-java-sdk-1.7.4.jar:/usr/hdp/current/hadoop-mapreduce-client/jsp-api-2.1.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-el-1.0.jar:/usr/hdp/current/hadoop-mapreduce-client/xmlenc-0.52.jar:/usr/hdp/current/hadoop-mapreduce-client/stax-api-1.0-2.jar:/usr/hdp/current/hadoop-mapreduce-client/curator-recipes-2.6.0.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-aws.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduc
e-client-common.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar:/usr/hdp/current/hadoop-mapreduce-client/jetty-util-6.1.26.hwx.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-distcp.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-archives-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-aws-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-beanutils-1.7.0.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jasper-compiler-5.5.23.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/api-util-1.0.0-M20.jar:/usr/hdp/current/hadoop-mapreduce-client/protobuf-java-2.5.0.jar:/usr/hdp/current/hadoop-mapreduce-client/httpclient-4.2.5.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-app.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-sls-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/htrace-core-3.0.4.jar:/usr/hdp/current/hadoop-mapreduce-client/paranamer-2.3.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-core-2.2.3.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-compress-1.4.1.jar:/usr/hdp/current/hadoop-mapreduce-client/jets3t-0.9.0.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-gridmix-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/microsoft-windowsazure-storage-sdk-0.6.0.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-math3-3.1.1.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-rumen-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jaxb-api-2.2.2.jar:/usr/hdp/current/hadoop-mapreduce-client/jettison-1.1.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jaxb-impl-2.2.3-1.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-xc-1.9.13.jar:/usr/hdp/current/hadoop-mapreduce-client/curator-client-2.6.0.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-jaxrs-1.9.13.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-core-asl-1.9.13.jar:/usr/hdp/current/hadoop-mapreduce-client/httpcore-4.2.5.jar:/usr/hdp/current/hadoop-mapreduce-client/guava-11.0.2.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-cli-1.2.jar:/usr/hdp/current/hadoop-mapreduce-client/zookeeper-3.4.6.jar:/usr/hdp/current/hadoop-mapreduce-client/jackson-annotations-2.2.3.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-datajoin-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/jersey-server-1.9.jar:/usr/hdp/current/hadoop-mapreduce-client/java-xmlbuilder-0.4.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-beanutils-core-1.8.0.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-archives.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-collections-3.2.1.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-codec-1.4.jar:/usr/hdp/current/hadoop-mapreduce-client/apacheds-i18n-2.0.0-M15.jar:/usr/hdp/current/hadoop-mapreduce-client/log4j-1.2.17.jar:/usr/hdp/current/hado
op-mapreduce-client/hadoop-streaming.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-extras-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.jar:/usr/hdp/current/hadoop-mapreduce-client/xz-1.0.jar:/usr/hdp/current/hadoop-mapreduce-client/jsr305-1.3.9.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-gridmix.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-app-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/netty-3.6.2.Final.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core.jar:/usr/hdp/current/hadoop-mapreduce-client/avro-1.7.4.jar:/usr/hdp/current/hadoop-mapreduce-client/junit-4.11.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core-2.6.0.2.2.0.0-2041.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-logging-1.1.3.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-extras.jar:/usr/hdp/current/hadoop-mapreduce-client/commons-net-3.1.jar:/usr/hdp/current/hadoop-mapreduce-client/snappy-java-1.0.4.1.jar:/usr/hdp/current/hadoop-mapreduce-client/activation-1.1.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-rumen.jar:/usr/hdp/current/hadoop-mapreduce-client/hadoop-openstack-2.6.0.2.2.0.0-2041.jar:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/conf:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar:/etc/hadoop/conf:/etc/hadoop/conf
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.io.tmpdir=/tmp
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.compiler=<NA>
2015-04-14 22:41:56,439 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.name=Linux
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.arch=amd64
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.version=2.6.32-504.8.1.el6.x86_64
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.name=root
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.home=/root
2015-04-14 22:41:56,440 INFO [Thread-2] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.dir=/root/hbase
2015-04-14 22:41:56,441 INFO [Thread-2] zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=hadoop-node02.mathartsys.com:2181,hadoop-node01.mathartsys.com:2181,hadoop-node03.mathartsys.com:2181 sessionTimeout=30000 watcher=hconnection-0x560cb988, quorum=hadoop-node02.mathartsys.com:2181,hadoop-node01.mathartsys.com:2181,hadoop-node03.mathartsys.com:2181, baseZNode=/hbase-unsecure
2015-04-14 22:41:56,458 INFO [Thread-2] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:<init>(120)) - Process identifier=hconnection-0x560cb988 connecting to ZooKeeper ensemble=hadoop-node02.mathartsys.com:2181,hadoop-node01.mathartsys.com:2181,hadoop-node03.mathartsys.com:2181
2015-04-14 22:41:56,460 INFO [Thread-2-SendThread(hadoop-node02.mathartsys.com:2181)] zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(966)) - Opening socket connection to server hadoop-node02.mathartsys.com/10.0.0.222:2181. Will not attempt to authenticate using SASL (unknown error)
2015-04-14 22:41:56,461 INFO [Thread-2-SendThread(hadoop-node02.mathartsys.com:2181)] zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(849)) - Socket connection established to hadoop-node02.mathartsys.com/10.0.0.222:2181, initiating session
2015-04-14 22:41:56,491 INFO [Thread-2-SendThread(hadoop-node02.mathartsys.com:2181)] zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1207)) - Session establishment complete on server hadoop-node02.mathartsys.com/10.0.0.222:2181, sessionid = 0x24cb25197440023, negotiated timeout = 30000
2015-04-14 22:41:56,605 INFO [Thread-2] util.RegionSizeCalculator (RegionSizeCalculator.java:<init>(76)) - Calculating region sizes for table "test".
2015-04-14 22:41:56,984 WARN [Thread-2] mapreduce.TableInputFormatBase (TableInputFormatBase.java:getSplits(193)) - Cannot resolve the host name for hadoop-node05.mathartsys.com/10.0.0.225 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '225.0.0.10.in-addr.arpa'
2015-04-14 22:41:57,013 INFO [Thread-2] spark.DefaultExecutionContext (Logging.scala:logInfo(59)) - Starting job: first at SerDeUtil.scala:202
......
2015-04-14 22:41:57,107 INFO [sparkDriver-akka.actor.default-dispatcher-3] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.0 in stage 0.0 (TID 0, hadoop-node04.mathartsys.com, RACK_LOCAL, 1312 bytes)
2015-04-14 22:41:57,216 WARN [task-result-getter-0] scheduler.TaskSetManager (Logging.scala:logWarning(71)) - Lost task 0.0 in stage 0.0 (TID 0, hadoop-node04.mathartsys.com): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-04-14 22:41:57,220 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.1 in stage 0.0 (TID 1, hadoop-node06.mathartsys.com, RACK_LOCAL, 1312 bytes)
2015-04-14 22:41:57,303 INFO [task-result-getter-1] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Lost task 0.1 in stage 0.0 (TID 1) on executor hadoop-node06.mathartsys.com: java.lang.IllegalStateException (unread block data) [duplicate 1]
2015-04-14 22:41:57,306 INFO [sparkDriver-akka.actor.default-dispatcher-3] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.2 in stage 0.0 (TID 2, hadoop-node04.mathartsys.com, RACK_LOCAL, 1312 bytes)
2015-04-14 22:41:57,327 INFO [task-result-getter-2] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Lost task 0.2 in stage 0.0 (TID 2) on executor hadoop-node04.mathartsys.com: java.lang.IllegalStateException (unread block data) [duplicate 2]
2015-04-14 22:41:57,330 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.3 in stage 0.0 (TID 3, hadoop-node06.mathartsys.com, RACK_LOCAL, 1312 bytes)
2015-04-14 22:41:57,347 INFO [task-result-getter-3] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Lost task 0.3 in stage 0.0 (TID 3) on executor hadoop-node06.mathartsys.com: java.lang.IllegalStateException (unread block data) [duplicate 3]
2015-04-14 22:41:57,348 ERROR [task-result-getter-3] scheduler.TaskSetManager (Logging.scala:logError(75)) - Task 0 in stage 0.0 failed 4 times; aborting job
2015-04-14 22:41:57,350 INFO [task-result-getter-3] cluster.YarnClientClusterScheduler (Logging.scala:logInfo(59)) - Removed TaskSet 0.0, whose tasks have all completed, from pool
2015-04-14 22:41:57,353 INFO [sparkDriver-akka.actor.default-dispatcher-4] cluster.YarnClientClusterScheduler (Logging.scala:logInfo(59)) - Cancelling stage 0
2015-04-14 22:41:57,357 INFO [Thread-2] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Job 0 failed: first at SerDeUtil.scala:202, took 0.343391 s
Traceback (most recent call last):
File "/root/hbase/hbase_test2.py", line 24, in <module>
conf=conf)
File "/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/python/pyspark/context.py", line 530, in newAPIHadoopRDD
jconf, batchSize)
File "/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
File "/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, hadoop-node06.mathartsys.com): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-examples-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark/spark-1.2.0.2.2.0.0-82-bin-2.6.0.2.2.0.0-2041/lib/spark-assembly-1.2.0.2.2.0.0-82-hadoop2.6.0.2.2.0.0-2041.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[root#hadoop-node03 hbase]#

Resources