Recently I'm trying wordaccount through MapReduce in Hadoop2.7.1. But the job always stuck at map 0% reduce 0%. Here is all the information:
No configs found; falling back on auto-configuration
No configs specified for hadoop runner
Looking for hadoop binary in /usr/local/hadoop/bin...
Found hadoop binary: /usr/local/hadoop/bin/hadoop
Using Hadoop version 2.7.1
Looking for Hadoop streaming jar in /usr/local/hadoop...
Found Hadoop streaming jar: /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.1.jar
Creating temp directory /tmp/wordaccount.xjj.20220524.013439.681080
uploading working dir files to hdfs:///user/xjj/tmp/mrjob/wordaccount.xjj.20220524.013439.681080/files/wd...
Copying other local files to hdfs:///user/xjj/tmp/mrjob/wordaccount.xjj.20220524.013439.681080/files/
Running step 1 of 1...
packageJobJar: [/tmp/hadoop-unjar3955585943094314924/] [] /tmp/streamjob2959762167969354976.jar tmpDir=null
Connecting to ResourceManager at /0.0.0.0:8032
Connecting to ResourceManager at /0.0.0.0:8032
Total input paths to process : 1
number of splits:2
Submitting tokens for job: job_1653356019342_0001
Submitted application application_1653356019342_0001
The url to track the job: http://master:8088/proxy/application_1653356019342_0001/
Running job: job_1653356019342_0001
Job job_1653356019342_0001 running in uber mode : false
map 0% reduce 0%
I entered the url and check the job, the content of the url is here enter image description here
Then I checked the resourcemanager-master.log:
2022-05-24 09:47:09,400 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Error cleaning master
java.net.ConnectException: Call From master/192.168.70.128 to master:36309 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
at org.apache.hadoop.ipc.Client.call(Client.java:1407)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy32.stopContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.stopContainers(ContainerManagementProtocolPBClientImpl.java:110)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.cleanup(AMLauncher.java:139)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:268)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529)
at org.apache.hadoop.ipc.Client.call(Client.java:1446)
... 9 more
2022-05-24 09:49:03,136 INFO logs: Aliases are enabled
and the nodemanager-master.log:
2022-05-24 09:35:00,684 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1653356019342_0001
2022-05-24 09:35:00,684 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_INIT for appId application_1653356019342_0001
2022-05-24 09:35:00,684 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got APPLICATION_INIT for service mapreduce_shuffle
2022-05-24 09:35:00,694 INFO org.apache.hadoop.mapred.ShuffleHandler: Added token for job_1653356019342_0001
2022-05-24 09:35:00,697 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1653356019342_0001
2022-05-24 09:35:00,697 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_INIT for appId application_1653356019342_0001
2022-05-24 09:35:00,697 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got APPLICATION_INIT for service mapreduce_shuffle
2022-05-24 09:35:00,697 INFO org.apache.hadoop.mapred.ShuffleHandler: Added token for job_1653356019342_0001
2022-05-24 09:35:00,698 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1653356019342_0001_01_000003 transitioned from LOCALIZING to LOCALIZED
2022-05-24 09:35:00,698 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1653356019342_0001_01_000002 transitioned from LOCALIZING to LOCALIZED
2022-05-24 09:35:00,735 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1653356019342_0001_01_000002 transitioned from LOCALIZED to RUNNING
2022-05-24 09:35:00,735 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory monitoring is needed. Not running the monitor-thread
2022-05-24 09:35:00,737 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1653356019342_0001_01_000003 transitioned from LOCALIZED to RUNNING
2022-05-24 09:35:00,737 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory monitoring is needed. Not running the monitor-thread
2022-05-24 09:35:00,743 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /usr/local/hadoop/tmp/nm-local-dir/usercache/xjj/appcache/application_1653356019342_0001/container_1653356019342_0001_01_000002/default_container_executor.sh]
2022-05-24 09:35:00,744 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /usr/local/hadoop/tmp/nm-local-dir/usercache/xjj/appcache/application_1653356019342_0001/container_1653356019342_0001_01_000003/default_container_executor.sh]
So what could be the problem? Connection refused or not enough memory? Thanks for your help.
I am trying to run spring example program. After successfull compilation using sbt I am using this following command:
spark-submit --class
org.apache.spark.examples.streaming.DirectKafkaWordCount --jars
/home/hadoop/jars/kafka_2.11-0.10.2.0.jar,/home/hadoop/jars/spark-streaming_2.11-2.1.0.jar,/home/hadoop/jars/spark-streaming-kafka-0-8_2.11-2.1.0.jar,/home/hadoop/jars/kafka-clients-0.10.0.0.jar,/home/hadoop/jars/metrics-core-2.2.0.jar
--master local[2] ../streamkafkaprog_2.11-1.0.jar localhost:9092 MyTopic
But I am getting :
17/03/28 02:15:35 INFO NettyBlockTransferService: Server created on 127.0.0.1:57453
17/03/28 02:15:35 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/03/28 02:15:35 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 57453, None)
17/03/28 02:15:35 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:57453 with 413.9 MB RAM, BlockManagerId(driver, 127.0.0.1, 57453, None)
17/03/28 02:15:35 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 57453, None)
17/03/28 02:15:35 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 57453, None)
17/03/28 02:15:36 INFO VerifiableProperties: Verifying properties
17/03/28 02:15:36 INFO VerifiableProperties: Property group.id is overridden to
17/03/28 02:15:36 INFO VerifiableProperties: Property zookeeper.connect is overridden to
17/03/28 02:15:36 INFO SimpleConsumer: Reconnect due to error:
java.lang.NoSuchMethodError: org.apache.kafka.common.network.NetworkSend.(Ljava/lang/String;Ljava/nio/ByteBuffer;)V
at kafka.network.RequestOrResponseSend.(RequestOrResponseSend.scala:41)
at kafka.network.RequestOrResponseSend.(RequestOrResponseSend.scala:44)
at kafka.network.BlockingChannel.send(BlockingChannel.scala:112)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:85)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:83)
at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:111)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$getPartitionMetadata$1.apply(KafkaCluster.scala:133)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$getPartitionMetadata$1.apply(KafkaCluster.scala:132)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:365)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:361)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at org.apache.spark.streaming.kafka.KafkaCluster.org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers(KafkaCluster.scala:361)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitionMetadata(KafkaCluster.scala:132)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitions(KafkaCluster.scala:119)
at org.apache.spark.streaming.kafka.KafkaUtils$.getFromOffsets(KafkaUtils.scala:211)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:484)
at org.apache.spark.examples.streaming.DirectKafkaWordCount$.main(DitectKafkaProg.scala:60)
at org.apache.spark.examples.streaming.DirectKafkaWordCount.main(DitectKafkaProg.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.kafka.common.network.NetworkSend.(Ljava/lang/String;Ljava/nio/ByteBuffer;)V
at kafka.network.RequestOrResponseSend.(RequestOrResponseSend.scala:41)
at kafka.network.RequestOrResponseSend.(RequestOrResponseSend.scala:44)
at kafka.network.BlockingChannel.send(BlockingChannel.scala:112)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:98)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:83)
at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:111)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$getPartitionMetadata$1.apply(KafkaCluster.scala:133)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$getPartitionMetadata$1.apply(KafkaCluster.scala:132)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:365)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:361)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at org.apache.spark.streaming.kafka.KafkaCluster.org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers(KafkaCluster.scala:361)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitionMetadata(KafkaCluster.scala:132)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitions(KafkaCluster.scala:119)
at org.apache.spark.streaming.kafka.KafkaUtils$.getFromOffsets(KafkaUtils.scala:211)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:484)
at org.apache.spark.examples.streaming.DirectKafkaWordCount$.main(DitectKafkaProg.scala:60)
at org.apache.spark.examples.streaming.DirectKafkaWordCount.main(DitectKafkaProg.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/03/28 02:15:36 INFO SparkContext: Invoking stop() from shutdown hook
17/03/28 02:15:36 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040
17/03/28 02:15:36 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/03/28 02:15:36 INFO MemoryStore: MemoryStore cleared
17/03/28 02:15:36 INFO BlockManager: BlockManager stopped
17/03/28 02:15:36 INFO BlockManagerMaster: BlockManagerMaster stopped
17/03/28 02:15:36 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/03/28 02:15:36 INFO SparkContext: Successfully stopped SparkContext
17/03/28 02:15:36 INFO ShutdownHookManager: Shutdown hook called
17/03/28 02:15:36 INFO ShutdownHookManager: Deleting directory /tmp/spark-5f636ca8-a2f8-4185-a2d0-a31757d2e721
I am using spark 2.1.0,kafka_2.11.0.10.0.0,spark-streaming-kafka-0-8_2.11:2.1.0 and scala-2.11.7.
Please Help me.
Thanks
En, hello everyone a quest troubled me a long time. I can run my spark app in standalone mode by this command
spark-submit --master spark://fuxiuyin-virtual-machine:7077 test_app.py
But this app fail to run in yarn cluster by this command
spark-submit --master yarn test_app.py
I think my yarn cluster is healthy.
The output of jps is
$ jps
8289 Worker
14882 NameNode
15475 ResourceManager
8134 Master
15751 NodeManager
15063 DataNode
17212 Jps
15295 SecondaryNameNode
And the 'Nodes of the cluster' page is
here
The output of spark-submit is
$ /opt/spark/bin/spark-submit --master yarn test_app.py
16/10/28 16:54:39 INFO spark.SparkContext: Running Spark version 2.0.1
16/10/28 16:54:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/10/28 16:54:39 INFO spark.SecurityManager: Changing view acls to: fuxiuyin
16/10/28 16:54:39 INFO spark.SecurityManager: Changing modify acls to: fuxiuyin
16/10/28 16:54:39 INFO spark.SecurityManager: Changing view acls groups to:
16/10/28 16:54:39 INFO spark.SecurityManager: Changing modify acls groups to:
16/10/28 16:54:39 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(fuxiuyin); groups with view permissions: Set(); users with modify permissions: Set(fuxiuyin); groups with modify permissions: Set()
16/10/28 16:54:39 INFO util.Utils: Successfully started service 'sparkDriver' on port 42519.
16/10/28 16:54:39 INFO spark.SparkEnv: Registering MapOutputTracker
16/10/28 16:54:39 INFO spark.SparkEnv: Registering BlockManagerMaster
16/10/28 16:54:39 INFO storage.DiskBlockManager: Created local directory at /opt/spark/blockmgr-1dcd1d1a-4cf4-4778-9b71-53e238a62c97
16/10/28 16:54:39 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
16/10/28 16:54:40 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/10/28 16:54:40 INFO util.log: Logging initialized #1843ms
16/10/28 16:54:40 INFO server.Server: jetty-9.2.z-SNAPSHOT
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#1b933891{/jobs,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#580d9060{/jobs/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#3a8fb3d9{/jobs/job,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#744ecb1b{/jobs/job/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#761b32b3{/stages,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#42213280{/stages/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#5775066{/stages/stage,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#7e355c0{/stages/stage/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#28426125{/stages/pool,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#63bcf39f{/stages/pool/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#5cf77bee{/storage,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#412768e5{/storage/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#7ad772ad{/storage/rdd,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#7ef35663{/storage/rdd/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#193c7a58{/environment,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#63a649da{/environment/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#22251d19{/executors,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#46810770{/executors/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#3c155b42{/executors/threadDump,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#6dac2d83{/executors/threadDump/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#67eb38fa{/static,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#291f19f0{/,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#3f4688da{/api,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#338a7a84{/stages/stage/kill,null,AVAILABLE}
16/10/28 16:54:40 INFO server.ServerConnector: Started ServerConnector#7df0e73{HTTP/1.1}{fuxiuyin-virtual-machine:4040}
16/10/28 16:54:40 INFO server.Server: Started #1962ms
16/10/28 16:54:40 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/10/28 16:54:40 INFO ui.SparkUI: Bound SparkUI to fuxiuyin-virtual-machine, and started at http://192.168.102.133:4040
16/10/28 16:54:40 INFO client.RMProxy: Connecting to ResourceManager at fuxiuyin-virtual-machine/192.168.102.133:8032
16/10/28 16:54:41 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/10/28 16:54:41 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/10/28 16:54:41 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/10/28 16:54:41 INFO yarn.Client: Setting up container launch context for our AM
16/10/28 16:54:41 INFO yarn.Client: Setting up the launch environment for our AM container
16/10/28 16:54:41 INFO yarn.Client: Preparing resources for our AM container
16/10/28 16:54:41 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/10/28 16:54:42 INFO yarn.Client: Uploading resource file:/opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/__spark_libs__697818607740390689.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/__spark_libs__697818607740390689.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/python/lib/pyspark.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/pyspark.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/python/lib/py4j-0.10.3-src.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/py4j-0.10.3-src.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/__spark_conf__7760765070208746118.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/__spark_conf__.zip
16/10/28 16:54:45 INFO spark.SecurityManager: Changing view acls to: fuxiuyin
16/10/28 16:54:45 INFO spark.SecurityManager: Changing modify acls to: fuxiuyin
16/10/28 16:54:45 INFO spark.SecurityManager: Changing view acls groups to:
16/10/28 16:54:45 INFO spark.SecurityManager: Changing modify acls groups to:
16/10/28 16:54:45 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(fuxiuyin); groups with view permissions: Set(); users with modify permissions: Set(fuxiuyin); groups with modify permissions: Set()
16/10/28 16:54:45 INFO yarn.Client: Submitting application application_1477644823180_0001 to ResourceManager
16/10/28 16:54:45 INFO impl.YarnClientImpl: Submitted application application_1477644823180_0001
16/10/28 16:54:45 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1477644823180_0001 and attemptId None
16/10/28 16:54:46 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:46 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1477644885891
final status: UNDEFINED
tracking URL: http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001/
user: fuxiuyin
16/10/28 16:54:47 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:48 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:49 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:50 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:51 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:52 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/10/28 16:54:52 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> fuxiuyin-virtual-machine, PROXY_URI_BASES -> http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001), /proxy/application_1477644823180_0001
16/10/28 16:54:52 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/10/28 16:54:52 INFO yarn.Client: Application report for application_1477644823180_0001 (state: RUNNING)
16/10/28 16:54:52 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.102.133
ApplicationMaster RPC port: 0
queue: default
start time: 1477644885891
final status: UNDEFINED
tracking URL: http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001/
user: fuxiuyin
16/10/28 16:54:52 INFO cluster.YarnClientSchedulerBackend: Application application_1477644823180_0001 has started running.
16/10/28 16:54:52 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39951.
16/10/28 16:54:52 INFO netty.NettyBlockTransferService: Server created on 192.168.102.133:39951
16/10/28 16:54:53 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.102.133:39951 with 366.3 MB RAM, BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#43ba5458{/metrics/json,null,AVAILABLE}
16/10/28 16:54:57 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/10/28 16:54:57 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> fuxiuyin-virtual-machine, PROXY_URI_BASES -> http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001), /proxy/application_1477644823180_0001
16/10/28 16:54:57 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/10/28 16:54:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.102.133:45708) with ID 1
16/10/28 16:54:59 INFO storage.BlockManagerMasterEndpoint: Registering block manager fuxiuyin-virtual-machine:33074 with 366.3 MB RAM, BlockManagerId(1, fuxiuyin-virtual-machine, 33074)
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.102.133:45712) with ID 2
16/10/28 16:55:00 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/10/28 16:55:00 INFO storage.BlockManagerMasterEndpoint: Registering block manager fuxiuyin-virtual-machine:43740 with 366.3 MB RAM, BlockManagerId(2, fuxiuyin-virtual-machine, 43740)
16/10/28 16:55:00 INFO spark.SparkContext: Starting job: collect at /home/fuxiuyin/test_app.py:8
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Got job 0 (collect at /home/fuxiuyin/test_app.py:8) with 2 output partitions
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (collect at /home/fuxiuyin/test_app.py:8)
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Missing parents: List()
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (PythonRDD[1] at collect at /home/fuxiuyin/test_app.py:8), which has no missing parents
16/10/28 16:55:00 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.8 KB, free 366.3 MB)
16/10/28 16:55:00 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.5 KB, free 366.3 MB)
16/10/28 16:55:00 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.102.133:39951 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:00 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (PythonRDD[1] at collect at /home/fuxiuyin/test_app.py:8)
16/10/28 16:55:00 INFO cluster.YarnScheduler: Adding task set 0.0 with 2 tasks
16/10/28 16:55:00 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, fuxiuyin-virtual-machine, partition 0, PROCESS_LOCAL, 5450 bytes)
16/10/28 16:55:00 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, fuxiuyin-virtual-machine, partition 1, PROCESS_LOCAL, 5469 bytes)
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 0 on executor id: 2 hostname: fuxiuyin-virtual-machine.
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 1 on executor id: 1 hostname: fuxiuyin-virtual-machine.
16/10/28 16:55:01 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/10/28 16:55:01 INFO server.ServerConnector: Stopped ServerConnector#7df0e73{HTTP/1.1}{fuxiuyin-virtual-machine:4040}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#338a7a84{/stages/stage/kill,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#3f4688da{/api,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#291f19f0{/,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#67eb38fa{/static,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#6dac2d83{/executors/threadDump/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#3c155b42{/executors/threadDump,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#46810770{/executors/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#22251d19{/executors,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#63a649da{/environment/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#193c7a58{/environment,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#7ef35663{/storage/rdd/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#7ad772ad{/storage/rdd,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#412768e5{/storage/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#5cf77bee{/storage,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#63bcf39f{/stages/pool/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#28426125{/stages/pool,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#7e355c0{/stages/stage/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#5775066{/stages/stage,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#42213280{/stages/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#761b32b3{/stages,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#744ecb1b{/jobs/job/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#3a8fb3d9{/jobs/job,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#580d9060{/jobs/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler#1b933891{/jobs,null,UNAVAILABLE}
16/10/28 16:55:01 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.102.133:4040
16/10/28 16:55:01 INFO scheduler.DAGScheduler: Job 0 failed: collect at /home/fuxiuyin/test_app.py:8, took 0.383872 s
16/10/28 16:55:01 INFO scheduler.DAGScheduler: ResultStage 0 (collect at /home/fuxiuyin/test_app.py:8) failed in 0.233 s
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo#469337f1)
Traceback (most recent call last):
File "/home/fuxiuyin/test_app.py", line 8, in <module>
print(data.collect())
File "/opt/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 776, in collect
File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(0,1477644901073,JobFailed(org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down))
: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:818)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:816)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:816)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1685)
at org.apache.spark.util.EventLoop.stop(EventLoop.scala:83)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1604)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1798)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1287)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1797)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:108)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1890)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1903)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1916)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1930)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:912)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
at org.apache.spark.rdd.RDD.collect(RDD.scala:911)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:453)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
16/10/28 16:55:01 ERROR client.TransportClient: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
16/10/28 16:55:01 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
16/10/28 16:55:01 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
16/10/28 16:55:01 ERROR util.Utils: Uncaught exception in thread Yarn application state monitor
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:508)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:93)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:151)
at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:455)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1605)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1798)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1287)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1797)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:108)
Caused by: java.io.IOException: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
16/10/28 16:55:01 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/10/28 16:55:01 INFO storage.DiskBlockManager: Shutdown hook called
16/10/28 16:55:01 INFO util.ShutdownHookManager: Shutdown hook called
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/userFiles-f51df2cd-8ec0-4caa-862f-77db0cc72505
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/pyspark-5216f977-d3c3-495f-b91a-88fa2218696d
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898
16/10/28 16:55:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on fuxiuyin-virtual-machine:43740 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(2, fuxiuyin-virtual-machine, 43740),broadcast_0_piece0,StorageLevel(memory, 1 replicas),2517,0))
16/10/28 16:55:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on fuxiuyin-virtual-machine:33074 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(1, fuxiuyin-virtual-machine, 33074),broadcast_0_piece0,StorageLevel(memory, 1 replicas),2517,0))
16/10/28 16:55:01 INFO memory.MemoryStore: MemoryStore cleared
16/10/28 16:55:01 INFO storage.BlockManager: BlockManager stopped
And the log of yarn resourcemanager is in
yarn-fuxiuyin-resourcemanager-fuxiuyin-virtual-machine.log
I submit app by this user:
uid=1000(fuxiuyin) gid=1000(fuxiuyin) 组=1000(fuxiuyin),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),108(lpadmin),124(sambashare)
My test_app is
from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName("test_app")
sc = SparkContext(conf=conf)
data = sc.parallelize([1, 2, 3])
data = data.map(lambda x: x + 1)
print(data.collect())
I don't how to fix it.
Thinks.
The driver has to collect all the data from worker nodes before printing so use the code below..
i think the error is due to
print(data.collect())
use
for x in data.collect():
print x
and use spark submit as:
spark-submit --master yarn deploy-mode cluster test_app.py
instead of spark-submit --master yarn test_app.py
try this command spark-submit --master yarn-client test_app.py
I've installed spark-1.6.1-bin-hadoop2.6.tgz on a 15-node Hadoop cluster. All nodes run Java 1.8.0_72 and the latest version of Hadoop. The Hadoop cluster itself is functional, e.g. YARN can run various MapReduce jobs successfully.
I can run Spark Shell locally on a node without any problems with the following command: $SPARK_HOME/bin/spark-shell.
I can also run some Spark examples successfully, such as SparkPi using YARN and cluster mode.
But when I try to run Spark Shell on YARN with deploy mode client, I encounter problems:
hadoopu#hadoop2:~$ $SPARK_HOME/bin/spark-shell --master yarn --deploy-mode client
16/03/21 15:15:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
...
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.6.1
/_/
Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_72)
Type in expressions to have them evaluated.
Type :help for more information.
...
16/03/21 15:15:24 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
16/03/21 15:15:24 INFO SparkEnv: Registering OutputCommitCoordinator
16/03/21 15:15:24 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/03/21 15:15:24 INFO SparkUI: Started SparkUI at http://10.108.57.32:4040
16/03/21 15:15:24 INFO RMProxy: Connecting to ResourceManager at hadoop2/10.108.57.32:8032
16/03/21 15:15:24 INFO Client: Requesting a new application from cluster with 13 NodeManagers
16/03/21 15:15:25 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (131072 MB per container)
16/03/21 15:15:25 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/03/21 15:15:25 INFO Client: Setting up container launch context for our AM
16/03/21 15:15:25 INFO Client: Setting up the launch environment for our AM container
16/03/21 15:15:25 INFO Client: Preparing resources for our AM container
16/03/21 15:15:25 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/03/21 15:15:25 INFO Client: Uploading resource file:/opt/spark-1.6.1-bin-hadoop2.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar -> hdfs://hadoop1:9000/user/hadoopu/.sparkStaging/application_1458568053208_0006/spark-assembly-1.6.1-hadoop2.6.0.jar
16/03/21 15:15:28 INFO Client: Uploading resource file:/tmp/spark-c9077c60-b379-439e-aeb4-85948df70df5/__spark_conf__7479505398141092205.zip -> hdfs://hadoop1:9000/user/hadoopu/.sparkStaging/application_1458568053208_0006/__spark_conf__7479505398141092205.zip
16/03/21 15:15:28 INFO SecurityManager: Changing view acls to: hadoopu
16/03/21 15:15:28 INFO SecurityManager: Changing modify acls to: hadoopu
16/03/21 15:15:28 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoopu); users with modify permissions: Set(hadoopu)
16/03/21 15:15:28 INFO Client: Submitting application 6 to ResourceManager
16/03/21 15:15:28 INFO YarnClientImpl: Submitted application application_1458568053208_0006
16/03/21 15:15:29 INFO Client: Application report for application_1458568053208_0006 (state: ACCEPTED)
16/03/21 15:15:29 INFO Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1458569728506
final status: UNDEFINED
tracking URL: http://hadoop2:8088/proxy/application_1458568053208_0006/
user: hadoopu
16/03/21 15:15:30 INFO Client: Application report for application_1458568053208_0006 (state: ACCEPTED)
16/03/21 15:15:31 INFO Client: Application report for application_1458568053208_0006 (state: ACCEPTED)
16/03/21 15:15:32 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/03/21 15:15:32 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoop2, PROXY_URI_BASES -> http://hadoop2:8088/proxy/application_1458568053208_0006), /proxy/application_1458568053208_0006
16/03/21 15:15:32 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/21 15:15:32 INFO Client: Application report for application_1458568053208_0006 (state: RUNNING)
16/03/21 15:15:32 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.108.57.41
ApplicationMaster RPC port: 0
queue: default
start time: 1458569728506
final status: UNDEFINED
tracking URL: http://hadoop2:8088/proxy/application_1458568053208_0006/
user: hadoopu
16/03/21 15:15:32 INFO YarnClientSchedulerBackend: Application application_1458568053208_0006 has started running.
16/03/21 15:15:32 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 50170.
16/03/21 15:15:32 INFO NettyBlockTransferService: Server created on 50170
16/03/21 15:15:32 INFO BlockManagerMaster: Trying to register BlockManager
16/03/21 15:15:32 INFO BlockManagerMasterEndpoint: Registering block manager 10.108.57.32:50170 with 511.1 MB RAM, BlockManagerId(driver, 10.108.57.32, 50170)
16/03/21 15:15:32 INFO BlockManagerMaster: Registered BlockManager
16/03/21 15:15:37 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/03/21 15:15:37 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hadoop2, PROXY_URI_BASES -> http://hadoop2:8088/proxy/application_1458568053208_0006), /proxy/application_1458568053208_0006
16/03/21 15:15:37 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/21 15:15:39 ERROR YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/03/21 15:15:39 INFO SparkUI: Stopped Spark web UI at http://10.108.57.32:4040
16/03/21 15:15:39 INFO YarnClientSchedulerBackend: Shutting down all executors
16/03/21 15:15:39 INFO YarnClientSchedulerBackend: Asking each executor to shut down
16/03/21 15:15:39 INFO YarnClientSchedulerBackend: Stopped
16/03/21 15:15:39 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/21 15:15:39 INFO MemoryStore: MemoryStore cleared
16/03/21 15:15:39 INFO BlockManager: BlockManager stopped
16/03/21 15:15:39 INFO BlockManagerMaster: BlockManagerMaster stopped
16/03/21 15:15:39 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/03/21 15:15:39 INFO SparkContext: Successfully stopped SparkContext
16/03/21 15:15:39 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/03/21 15:15:39 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/03/21 15:15:39 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/03/21 15:15:54 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/03/21 15:15:54 ERROR SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
at $line3.$read$$iwC$$iwC.<init>(<console>:15)
at $line3.$read$$iwC.<init>(<console>:24)
at $line3.$read.<init>(<console>:26)
at $line3.$read$.<init>(<console>:30)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.<init>(<console>:7)
at $line3.$eval$.<clinit>(<console>)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/03/21 15:15:54 INFO SparkContext: SparkContext already stopped.
java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
at $iwC$$iwC.<init>(<console>:15)
at $iwC.<init>(<console>:24)
at <init>(<console>:26)
at .<init>(<console>:30)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
java.lang.NullPointerException
at org.apache.spark.sql.SQLContext$.createListenerAndUI(SQLContext.scala:1367)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
at $iwC$$iwC.<init>(<console>:15)
at $iwC.<init>(<console>:24)
at <init>(<console>:26)
at .<init>(<console>:30)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:132)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
<console>:16: error: not found: value sqlContext
import sqlContext.implicits._
^
<console>:16: error: not found: value sqlContext
import sqlContext.sql
^
scala>
scala> sc
<console>:20: error: not found: value sc
sc
^
scala>
I've also went to the YARN Web UI, found the Spark Shell in the list of FINISHED applications, then clicked on the application to see the logs. I've found two nodes with stderr logs:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt/ssd1/tmp/nm-local-dir/usercache/hadoopu/filecache/13/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-3.0.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/03/21 15:07:20 INFO ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
16/03/21 15:07:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/21 15:07:21 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1458568053208_0005_000002
16/03/21 15:07:22 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/03/21 15:07:22 INFO SecurityManager: Changing view acls to: hadoopu
16/03/21 15:07:22 INFO SecurityManager: Changing modify acls to: hadoopu
16/03/21 15:07:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoopu); users with modify permissions: Set(hadoopu)
16/03/21 15:07:22 INFO ApplicationMaster: Waiting for Spark driver to be reachable.
16/03/21 15:07:22 INFO ApplicationMaster: Driver now available: 10.108.57.32:39824
16/03/21 15:07:22 INFO ApplicationMaster$AMEndpoint: Add WebUI Filter. AddWebUIFilter(org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter,Map(PROXY_HOSTS -> hadoop2, PROXY_URI_BASES -> http://hadoop2:8088/proxy/application_1458568053208_0005),/proxy/application_1458568053208_0005)
16/03/21 15:07:22 INFO RMProxy: Connecting to ResourceManager at hadoop2/10.108.57.32:8030
16/03/21 15:07:22 INFO YarnRMClient: Registering the ApplicationMaster
16/03/21 15:07:22 INFO YarnAllocator: Will request 2 executor containers, each with 1 cores and 1408 MB memory including 384 MB overhead
16/03/21 15:07:22 INFO YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
16/03/21 15:07:22 INFO YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
16/03/21 15:07:22 INFO ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
16/03/21 15:07:23 INFO AMRMClientImpl: Received new token for : hadoop14:32420
16/03/21 15:07:23 INFO AMRMClientImpl: Received new token for : hadoop3:35904
16/03/21 15:07:23 INFO YarnAllocator: Launching container container_1458568053208_0005_02_000002 for on host hadoop14
16/03/21 15:07:23 INFO YarnAllocator: Launching ExecutorRunnable. driverUrl: spark://CoarseGrainedScheduler#10.108.57.32:39824, executorHostname: hadoop14
16/03/21 15:07:23 INFO YarnAllocator: Launching container container_1458568053208_0005_02_000003 for on host hadoop3
16/03/21 15:07:23 INFO ExecutorRunnable: Starting Executor Container
16/03/21 15:07:23 INFO YarnAllocator: Launching ExecutorRunnable. driverUrl: spark://CoarseGrainedScheduler#10.108.57.32:39824, executorHostname: hadoop3
16/03/21 15:07:23 INFO ExecutorRunnable: Starting Executor Container
16/03/21 15:07:23 INFO YarnAllocator: Received 2 containers from YARN, launching executors on 2 of them.
16/03/21 15:07:23 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
16/03/21 15:07:23 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
16/03/21 15:07:23 INFO ExecutorRunnable: Setting up ContainerLaunchContext
16/03/21 15:07:23 INFO ExecutorRunnable: Setting up ContainerLaunchContext
16/03/21 15:07:23 INFO ExecutorRunnable: Preparing Local resources
16/03/21 15:07:23 INFO ExecutorRunnable: Preparing Local resources
16/03/21 15:07:23 INFO ExecutorRunnable: Prepared Local resources Map(__spark__.jar -> resource { scheme: "hdfs" host: "hadoop1" port: 9000 file: "/user/hadoopu/.sparkStaging/application_1458568053208_0005/spark-assembly-1.6.1-hadoop2.6.0.jar" } size: 187698038 timestamp: 1458569230874 type: FILE visibility: PRIVATE)
16/03/21 15:07:23 INFO ExecutorRunnable: Prepared Local resources Map(__spark__.jar -> resource { scheme: "hdfs" host: "hadoop1" port: 9000 file: "/user/hadoopu/.sparkStaging/application_1458568053208_0005/spark-assembly-1.6.1-hadoop2.6.0.jar" } size: 187698038 timestamp: 1458569230874 type: FILE visibility: PRIVATE)
16/03/21 15:07:23 INFO ExecutorRunnable:
===============================================================================
YARN executor launch context:
env:
CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_PREFIX/share/hadoop/tools/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
SPARK_LOG_URL_STDERR -> http://hadoop3:8042/node/containerlogs/container_1458568053208_0005_02_000003/hadoopu/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1458568053208_0005
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 187698038
SPARK_USER -> hadoopu
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1458569230874
SPARK_LOG_URL_STDOUT -> http://hadoop3:8042/node/containerlogs/container_1458568053208_0005_02_000003/hadoopu/stdout?start=-4096
SPARK_YARN_CACHE_FILES -> hdfs://hadoop1:9000/user/hadoopu/.sparkStaging/application_1458568053208_0005/spark-assembly-1.6.1-hadoop2.6.0.jar#__spark__.jar
command:
{{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=39824' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler#10.108.57.32:39824 --executor-id 2 --hostname hadoop3 --cores 1 --app-id application_1458568053208_0005 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
===============================================================================
16/03/21 15:07:23 INFO ExecutorRunnable:
===============================================================================
YARN executor launch context:
env:
CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_PREFIX/share/hadoop/tools/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
SPARK_LOG_URL_STDERR -> http://hadoop14:8042/node/containerlogs/container_1458568053208_0005_02_000002/hadoopu/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1458568053208_0005
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 187698038
SPARK_USER -> hadoopu
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1458569230874
SPARK_LOG_URL_STDOUT -> http://hadoop14:8042/node/containerlogs/container_1458568053208_0005_02_000002/hadoopu/stdout?start=-4096
SPARK_YARN_CACHE_FILES -> hdfs://hadoop1:9000/user/hadoopu/.sparkStaging/application_1458568053208_0005/spark-assembly-1.6.1-hadoop2.6.0.jar#__spark__.jar
command:
{{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=39824' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler#10.108.57.32:39824 --executor-id 1 --hostname hadoop14 --cores 1 --app-id application_1458568053208_0005 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
===============================================================================
...
16/03/21 15:07:25 ERROR ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
16/03/21 15:07:25 INFO ApplicationMaster: Final app status: UNDEFINED, exitCode: 0, (reason: Shutdown hook called before final status was reported.)
16/03/21 15:07:25 INFO ApplicationMaster: Unregistering ApplicationMaster with UNDEFINED (diag message: Shutdown hook called before final status was reported.)
16/03/21 15:07:25 INFO AMRMClientImpl: Waiting for application to be successfully unregistered.
16/03/21 15:07:25 INFO ApplicationMaster: Deleting staging directory .sparkStaging/application_1458568053208_0005
16/03/21 15:07:25 INFO ShutdownHookManager: Shutdown hook called
Any ideas why I can't run Spark Shell on YARN with client mode?
I had the same issue. It turned out to be a firewall between my login node and the cluster: the cluster was trying to connect back to the login node on a random port that was blocked. Either remove the firewall rules, or move your shell to one of the nodes of the cluster where there aren't any firewall rules that block access.
When I put the 'val lines = sc.textFile("hdfs:///input")' in yarn-client, 'Cannot call methods on a stopped SparkContext' error occur. I searched all day long for two days, but I don't know where is cause. "hdfs:///input" is right, because when I executed it in standalone mode, I worked well.
Could you give me a any idea of that?
I'm using spark 1.5.2, hadoop 2.7.2.
tarting org.apache.spark.deploy.master.Master, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
192.168.111.203: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out
192.168.111.202: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
[root#master spark-1.5.2-bin-hadoop2.6]# bin/spark-shell --master yarn-client
16/03/19 05:59:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 05:59:12 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:12 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:13 INFO spark.HttpServer: Starting HTTP Server
16/03/19 05:59:13 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:13 INFO server.AbstractConnector: Started SocketConnector#0.0.0.0:46780
16/03/19 05:59:13 INFO util.Utils: Successfully started service 'HTTP class server' on port 46780.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.5.2
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_73)
Type in expressions to have them evaluated.
Type :help for more information.
16/03/19 05:59:17 INFO spark.SparkContext: Running Spark version 1.5.2
16/03/19 05:59:17 WARN spark.SparkConf:
SPARK_JAVA_OPTS was detected (set to '-Dspark.driver.port=53411').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with conf/spark-defaults.conf to set defaults for an application
- ./spark-submit with --driver-java-options to set -X options for a driver
- spark.executor.extraJavaOptions to set -X options for executors
- SPARK_DAEMON_JAVA_OPTS to set java options for standalone daemons (master or worker)
16/03/19 05:59:17 WARN spark.SparkConf: Setting 'spark.executor.extraJavaOptions' to '-Dspark.driver.port=53411' as a work-around.
16/03/19 05:59:17 WARN spark.SparkConf: Setting 'spark.driver.extraJavaOptions' to '-Dspark.driver.port=53411' as a work-around.
16/03/19 05:59:17 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:17 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:17 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:18 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/03/19 05:59:18 INFO Remoting: Starting remoting
16/03/19 05:59:18 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver#192.168.111.201:53411]
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'sparkDriver' on port 53411.
16/03/19 05:59:18 INFO spark.SparkEnv: Registering MapOutputTracker
16/03/19 05:59:18 INFO spark.SparkEnv: Registering BlockManagerMaster
16/03/19 05:59:18 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-f70b1bb6-288b-4894-bb49-22d1fc3d8d89
16/03/19 05:59:18 INFO storage.MemoryStore: MemoryStore started with capacity 534.5 MB
16/03/19 05:59:18 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-58591b6b-5b19-4bc0-a993-0b846de5ef6f/httpd-fe0c46a2-1d87-4bc7-8b4f-adfc79cb762a
16/03/19 05:59:18 INFO spark.HttpServer: Starting HTTP Server
16/03/19 05:59:18 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:18 INFO server.AbstractConnector: Started SocketConnector#0.0.0.0:40258
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'HTTP file server' on port 40258.
16/03/19 05:59:18 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/03/19 05:59:18 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:18 INFO server.AbstractConnector: Started SelectChannelConnector#0.0.0.0:4040
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/03/19 05:59:18 INFO ui.SparkUI: Started SparkUI at http://192.168.111.201:4040
16/03/19 05:59:19 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/03/19 05:59:19 INFO client.RMProxy: Connecting to ResourceManager at /192.168.111.201:8032
16/03/19 05:59:19 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
16/03/19 05:59:19 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/03/19 05:59:19 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/03/19 05:59:19 INFO yarn.Client: Setting up container launch context for our AM
16/03/19 05:59:19 INFO yarn.Client: Setting up the launch environment for our AM container
16/03/19 05:59:19 INFO yarn.Client: Preparing resources for our AM container
16/03/19 05:59:21 INFO yarn.Client: Uploading resource file:/opt/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar -> hdfs://192.168.111.201:9000/user/root/.sparkStaging/application_1458334003417_0002/spark-assembly-1.5.2-hadoop2.6.0.jar
16/03/19 05:59:25 INFO yarn.Client: Uploading resource file:/tmp/spark-58591b6b-5b19-4bc0-a993-0b846de5ef6f/__spark_conf__2052137095112870542.zip -> hdfs://192.168.111.201:9000/user/root/.sparkStaging/application_1458334003417_0002/__spark_conf__2052137095112870542.zip
16/03/19 05:59:25 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:25 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:25 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:25 INFO yarn.Client: Submitting application 2 to ResourceManager
16/03/19 05:59:25 INFO impl.YarnClientImpl: Submitted application application_1458334003417_0002
16/03/19 05:59:26 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:26 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1458334765746
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1458334003417_0002/
user: root
16/03/19 05:59:27 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:28 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:29 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:30 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:31 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:32 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:33 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:34 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:35 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM#192.168.111.203:46505/user/YarnAM#149895142])
16/03/19 05:59:35 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1458334003417_0002), /proxy/application_1458334003417_0002
16/03/19 05:59:35 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/19 05:59:35 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:46505
16/03/19 05:59:35 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM#192.168.111.203:46505] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:35 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:46505
16/03/19 05:59:35 INFO yarn.Client: Application report for application_1458334003417_0002 (state: RUNNING)
16/03/19 05:59:35 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.111.203
ApplicationMaster RPC port: 0
queue: default
start time: 1458334765746
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1458334003417_0002/
user: root
16/03/19 05:59:35 INFO cluster.YarnClientSchedulerBackend: Application application_1458334003417_0002 has started running.
16/03/19 05:59:36 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42938.
16/03/19 05:59:36 INFO netty.NettyBlockTransferService: Server created on 42938
16/03/19 05:59:36 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/03/19 05:59:36 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.111.201:42938 with 534.5 MB RAM, BlockManagerId(driver, 192.168.111.201, 42938)
16/03/19 05:59:36 INFO storage.BlockManagerMaster: Registered BlockManager
16/03/19 05:59:40 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM#192.168.111.203:34633/user/YarnAM#-40449267])
16/03/19 05:59:40 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1458334003417_0002), /proxy/application_1458334003417_0002
16/03/19 05:59:40 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM#192.168.111.203:34633] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/03/19 05:59:41 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.111.201:4040
16/03/19 05:59:41 INFO scheduler.DAGScheduler: Stopping DAGScheduler
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Stopped
16/03/19 05:59:42 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/19 05:59:42 INFO storage.MemoryStore: MemoryStore cleared
16/03/19 05:59:42 INFO storage.BlockManager: BlockManager stopped
16/03/19 05:59:42 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/03/19 05:59:42 INFO spark.SparkContext: Successfully stopped SparkContext
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/03/19 05:59:49 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/03/19 05:59:49 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.
16/03/19 05:59:49 INFO hive.HiveContext: Initializing execution hive, version 1.2.1
16/03/19 05:59:49 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0
16/03/19 05:59:49 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/03/19 05:59:50 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/03/19 05:59:50 INFO metastore.ObjectStore: ObjectStore, initialize called
16/03/19 05:59:50 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/03/19 05:59:50 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/03/19 05:59:50 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 05:59:51 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 05:59:53 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/03/19 05:59:54 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:54 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/03/19 05:59:56 INFO metastore.ObjectStore: Initialized ObjectStore
16/03/19 05:59:57 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/03/19 05:59:57 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/03/19 05:59:57 INFO metastore.HiveMetaStore: Added admin role in metastore
16/03/19 05:59:57 INFO metastore.HiveMetaStore: Added public role in metastore
16/03/19 05:59:58 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/03/19 05:59:58 INFO metastore.HiveMetaStore: 0: get_all_databases
16/03/19 05:59:58 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/03/19 05:59:58 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/19 05:59:58 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/19 05:59:58 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/root
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/e16dc45f-de41-4e69-9f73-c976cc3358c9_resources
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/e16dc45f-de41-4e69-9f73-c976cc3358c9
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/root/e16dc45f-de41-4e69-9f73-c976cc3358c9
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/e16dc45f-de41-4e69-9f73-c976cc3358c9/_tmp_space.db
16/03/19 05:59:58 INFO hive.HiveContext: default warehouse location is /user/hive/warehouse
16/03/19 05:59:58 INFO hive.HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/03/19 05:59:58 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0
16/03/19 05:59:59 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/03/19 06:00:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 06:00:00 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/03/19 06:00:00 INFO metastore.ObjectStore: ObjectStore, initialize called
16/03/19 06:00:00 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/03/19 06:00:00 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/03/19 06:00:00 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 06:00:00 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 06:00:01 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/03/19 06:00:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/03/19 06:00:04 INFO metastore.ObjectStore: Initialized ObjectStore
16/03/19 06:00:04 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/03/19 06:00:05 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/03/19 06:00:05 INFO metastore.HiveMetaStore: Added admin role in metastore
16/03/19 06:00:05 INFO metastore.HiveMetaStore: Added public role in metastore
16/03/19 06:00:05 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/03/19 06:00:05 INFO metastore.HiveMetaStore: 0: get_all_databases
16/03/19 06:00:05 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/03/19 06:00:06 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/19 06:00:06 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/19 06:00:06 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:06 INFO session.SessionState: Created local directory: /tmp/b046e212-ccbd-4415-aec3-5b207f147fda_resources
16/03/19 06:00:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/b046e212-ccbd-4415-aec3-5b207f147fda
16/03/19 06:00:06 INFO session.SessionState: Created local directory: /tmp/root/b046e212-ccbd-4415-aec3-5b207f147fda
16/03/19 06:00:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/b046e212-ccbd-4415-aec3-5b207f147fda/_tmp_space.db
16/03/19 06:00:06 INFO repl.SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.
scala> val lines = sc.textFile("hdfs:///input")
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)
at org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:2063)
at org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:2076)
at org.apache.spark.SparkContext.textFile$default$2(SparkContext.scala:825)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:21)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:28)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
at $iwC$$iwC$$iwC.<init>(<console>:34)
at $iwC$$iwC.<init>(<console>:36)
at $iwC.<init>(<console>:38)
at <init>(<console>:40)
at .<init>(<console>:44)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I encountered this in my Spark Structured Streaming application when I forgot to include the following:
spark.streams.awaitAnyTermination()
Your YARN application exits immediately after it starts:
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM#192.168.111.203:34633] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
Then, SparkContext is closed, so any action on this context will throw the exception you see.
Check the "Application Master" logs (visible through YARN's UI) to see the cause for the failure. This could be a memory configuration issue, network issues (e.g. host unreachable) and more - the log on the driver side (which is what you pasted) won't tell you which one it is.