Hello everyone, a question has troubled me for a long time. I can run my Spark app in standalone mode with this command:
spark-submit --master spark://fuxiuyin-virtual-machine:7077 test_app.py
But this app fails to run on the YARN cluster with this command:
spark-submit --master yarn test_app.py
I think my YARN cluster is healthy.
The output of jps is
$ jps
8289 Worker
14882 NameNode
15475 ResourceManager
8134 Master
15751 NodeManager
15063 DataNode
17212 Jps
15295 SecondaryNameNode
And the 'Nodes of the cluster' page is here (screenshot omitted).
The output of spark-submit is
$ /opt/spark/bin/spark-submit --master yarn test_app.py
16/10/28 16:54:39 INFO spark.SparkContext: Running Spark version 2.0.1
16/10/28 16:54:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/10/28 16:54:39 INFO spark.SecurityManager: Changing view acls to: fuxiuyin
16/10/28 16:54:39 INFO spark.SecurityManager: Changing modify acls to: fuxiuyin
16/10/28 16:54:39 INFO spark.SecurityManager: Changing view acls groups to:
16/10/28 16:54:39 INFO spark.SecurityManager: Changing modify acls groups to:
16/10/28 16:54:39 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(fuxiuyin); groups with view permissions: Set(); users with modify permissions: Set(fuxiuyin); groups with modify permissions: Set()
16/10/28 16:54:39 INFO util.Utils: Successfully started service 'sparkDriver' on port 42519.
16/10/28 16:54:39 INFO spark.SparkEnv: Registering MapOutputTracker
16/10/28 16:54:39 INFO spark.SparkEnv: Registering BlockManagerMaster
16/10/28 16:54:39 INFO storage.DiskBlockManager: Created local directory at /opt/spark/blockmgr-1dcd1d1a-4cf4-4778-9b71-53e238a62c97
16/10/28 16:54:39 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
16/10/28 16:54:40 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/10/28 16:54:40 INFO util.log: Logging initialized @1843ms
16/10/28 16:54:40 INFO server.Server: jetty-9.2.z-SNAPSHOT
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b933891{/jobs,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@580d9060{/jobs/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3a8fb3d9{/jobs/job,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@744ecb1b{/jobs/job/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@761b32b3{/stages,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@42213280{/stages/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5775066{/stages/stage,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7e355c0{/stages/stage/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@28426125{/stages/pool,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@63bcf39f{/stages/pool/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5cf77bee{/storage,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@412768e5{/storage/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7ad772ad{/storage/rdd,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7ef35663{/storage/rdd/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@193c7a58{/environment,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@63a649da{/environment/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@22251d19{/executors,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@46810770{/executors/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3c155b42{/executors/threadDump,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6dac2d83{/executors/threadDump/json,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@67eb38fa{/static,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@291f19f0{/,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3f4688da{/api,null,AVAILABLE}
16/10/28 16:54:40 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@338a7a84{/stages/stage/kill,null,AVAILABLE}
16/10/28 16:54:40 INFO server.ServerConnector: Started ServerConnector@7df0e73{HTTP/1.1}{fuxiuyin-virtual-machine:4040}
16/10/28 16:54:40 INFO server.Server: Started @1962ms
16/10/28 16:54:40 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/10/28 16:54:40 INFO ui.SparkUI: Bound SparkUI to fuxiuyin-virtual-machine, and started at http://192.168.102.133:4040
16/10/28 16:54:40 INFO client.RMProxy: Connecting to ResourceManager at fuxiuyin-virtual-machine/192.168.102.133:8032
16/10/28 16:54:41 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/10/28 16:54:41 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/10/28 16:54:41 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/10/28 16:54:41 INFO yarn.Client: Setting up container launch context for our AM
16/10/28 16:54:41 INFO yarn.Client: Setting up the launch environment for our AM container
16/10/28 16:54:41 INFO yarn.Client: Preparing resources for our AM container
16/10/28 16:54:41 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/10/28 16:54:42 INFO yarn.Client: Uploading resource file:/opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/__spark_libs__697818607740390689.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/__spark_libs__697818607740390689.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/python/lib/pyspark.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/pyspark.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/python/lib/py4j-0.10.3-src.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/py4j-0.10.3-src.zip
16/10/28 16:54:45 INFO yarn.Client: Uploading resource file:/opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/__spark_conf__7760765070208746118.zip -> hdfs://fuxiuyin-virtual-machine:9000/user/fuxiuyin/.sparkStaging/application_1477644823180_0001/__spark_conf__.zip
16/10/28 16:54:45 INFO spark.SecurityManager: Changing view acls to: fuxiuyin
16/10/28 16:54:45 INFO spark.SecurityManager: Changing modify acls to: fuxiuyin
16/10/28 16:54:45 INFO spark.SecurityManager: Changing view acls groups to:
16/10/28 16:54:45 INFO spark.SecurityManager: Changing modify acls groups to:
16/10/28 16:54:45 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(fuxiuyin); groups with view permissions: Set(); users with modify permissions: Set(fuxiuyin); groups with modify permissions: Set()
16/10/28 16:54:45 INFO yarn.Client: Submitting application application_1477644823180_0001 to ResourceManager
16/10/28 16:54:45 INFO impl.YarnClientImpl: Submitted application application_1477644823180_0001
16/10/28 16:54:45 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1477644823180_0001 and attemptId None
16/10/28 16:54:46 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:46 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1477644885891
final status: UNDEFINED
tracking URL: http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001/
user: fuxiuyin
16/10/28 16:54:47 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:48 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:49 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:50 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:51 INFO yarn.Client: Application report for application_1477644823180_0001 (state: ACCEPTED)
16/10/28 16:54:52 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/10/28 16:54:52 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> fuxiuyin-virtual-machine, PROXY_URI_BASES -> http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001), /proxy/application_1477644823180_0001
16/10/28 16:54:52 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/10/28 16:54:52 INFO yarn.Client: Application report for application_1477644823180_0001 (state: RUNNING)
16/10/28 16:54:52 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.102.133
ApplicationMaster RPC port: 0
queue: default
start time: 1477644885891
final status: UNDEFINED
tracking URL: http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001/
user: fuxiuyin
16/10/28 16:54:52 INFO cluster.YarnClientSchedulerBackend: Application application_1477644823180_0001 has started running.
16/10/28 16:54:52 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39951.
16/10/28 16:54:52 INFO netty.NettyBlockTransferService: Server created on 192.168.102.133:39951
16/10/28 16:54:53 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.102.133:39951 with 366.3 MB RAM, BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.102.133, 39951)
16/10/28 16:54:53 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@43ba5458{/metrics/json,null,AVAILABLE}
16/10/28 16:54:57 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/10/28 16:54:57 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> fuxiuyin-virtual-machine, PROXY_URI_BASES -> http://fuxiuyin-virtual-machine:8088/proxy/application_1477644823180_0001), /proxy/application_1477644823180_0001
16/10/28 16:54:57 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/10/28 16:54:59 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.102.133:45708) with ID 1
16/10/28 16:54:59 INFO storage.BlockManagerMasterEndpoint: Registering block manager fuxiuyin-virtual-machine:33074 with 366.3 MB RAM, BlockManagerId(1, fuxiuyin-virtual-machine, 33074)
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.102.133:45712) with ID 2
16/10/28 16:55:00 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/10/28 16:55:00 INFO storage.BlockManagerMasterEndpoint: Registering block manager fuxiuyin-virtual-machine:43740 with 366.3 MB RAM, BlockManagerId(2, fuxiuyin-virtual-machine, 43740)
16/10/28 16:55:00 INFO spark.SparkContext: Starting job: collect at /home/fuxiuyin/test_app.py:8
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Got job 0 (collect at /home/fuxiuyin/test_app.py:8) with 2 output partitions
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (collect at /home/fuxiuyin/test_app.py:8)
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Missing parents: List()
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (PythonRDD[1] at collect at /home/fuxiuyin/test_app.py:8), which has no missing parents
16/10/28 16:55:00 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.8 KB, free 366.3 MB)
16/10/28 16:55:00 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.5 KB, free 366.3 MB)
16/10/28 16:55:00 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.102.133:39951 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:00 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/10/28 16:55:00 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (PythonRDD[1] at collect at /home/fuxiuyin/test_app.py:8)
16/10/28 16:55:00 INFO cluster.YarnScheduler: Adding task set 0.0 with 2 tasks
16/10/28 16:55:00 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, fuxiuyin-virtual-machine, partition 0, PROCESS_LOCAL, 5450 bytes)
16/10/28 16:55:00 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, fuxiuyin-virtual-machine, partition 1, PROCESS_LOCAL, 5469 bytes)
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 0 on executor id: 2 hostname: fuxiuyin-virtual-machine.
16/10/28 16:55:00 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 1 on executor id: 1 hostname: fuxiuyin-virtual-machine.
16/10/28 16:55:01 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/10/28 16:55:01 INFO server.ServerConnector: Stopped ServerConnector@7df0e73{HTTP/1.1}{fuxiuyin-virtual-machine:4040}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@338a7a84{/stages/stage/kill,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3f4688da{/api,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@291f19f0{/,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@67eb38fa{/static,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6dac2d83{/executors/threadDump/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3c155b42{/executors/threadDump,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@46810770{/executors/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@22251d19{/executors,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@63a649da{/environment/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@193c7a58{/environment,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7ef35663{/storage/rdd/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7ad772ad{/storage/rdd,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@412768e5{/storage/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5cf77bee{/storage,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@63bcf39f{/stages/pool/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@28426125{/stages/pool,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7e355c0{/stages/stage/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5775066{/stages/stage,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@42213280{/stages/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@761b32b3{/stages,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@744ecb1b{/jobs/job/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3a8fb3d9{/jobs/job,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@580d9060{/jobs/json,null,UNAVAILABLE}
16/10/28 16:55:01 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1b933891{/jobs,null,UNAVAILABLE}
16/10/28 16:55:01 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.102.133:4040
16/10/28 16:55:01 INFO scheduler.DAGScheduler: Job 0 failed: collect at /home/fuxiuyin/test_app.py:8, took 0.383872 s
16/10/28 16:55:01 INFO scheduler.DAGScheduler: ResultStage 0 (collect at /home/fuxiuyin/test_app.py:8) failed in 0.233 s
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@469337f1)
Traceback (most recent call last):
File "/home/fuxiuyin/test_app.py", line 8, in <module>
print(data.collect())
File "/opt/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 776, in collect
File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/opt/spark/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 319, in get_return_value
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(0,1477644901073,JobFailed(org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down))
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:818)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:816)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:816)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1685)
at org.apache.spark.util.EventLoop.stop(EventLoop.scala:83)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1604)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1798)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1287)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1797)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:108)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1890)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1903)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1916)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1930)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:912)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
at org.apache.spark.rdd.RDD.collect(RDD.scala:911)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:453)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
16/10/28 16:55:01 ERROR client.TransportClient: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
16/10/28 16:55:01 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
16/10/28 16:55:01 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
16/10/28 16:55:01 ERROR util.Utils: Uncaught exception in thread Yarn application state monitor
org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:508)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:93)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:151)
at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:455)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1605)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1798)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1287)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1797)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:108)
Caused by: java.io.IOException: Failed to send RPC 9187551343857476032 to /192.168.102.133:45698: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
16/10/28 16:55:01 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/10/28 16:55:01 INFO storage.DiskBlockManager: Shutdown hook called
16/10/28 16:55:01 INFO util.ShutdownHookManager: Shutdown hook called
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/userFiles-f51df2cd-8ec0-4caa-862f-77db0cc72505
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898/pyspark-5216f977-d3c3-495f-b91a-88fa2218696d
16/10/28 16:55:01 INFO util.ShutdownHookManager: Deleting directory /opt/spark/spark-97ecc15d-7f26-4b73-a67e-953fdc127898
16/10/28 16:55:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on fuxiuyin-virtual-machine:43740 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(2, fuxiuyin-virtual-machine, 43740),broadcast_0_piece0,StorageLevel(memory, 1 replicas),2517,0))
16/10/28 16:55:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on fuxiuyin-virtual-machine:33074 (size: 2.5 KB, free: 366.3 MB)
16/10/28 16:55:01 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(1, fuxiuyin-virtual-machine, 33074),broadcast_0_piece0,StorageLevel(memory, 1 replicas),2517,0))
16/10/28 16:55:01 INFO memory.MemoryStore: MemoryStore cleared
16/10/28 16:55:01 INFO storage.BlockManager: BlockManager stopped
And the log of the YARN ResourceManager is in
yarn-fuxiuyin-resourcemanager-fuxiuyin-virtual-machine.log
I submit the app as this user:
uid=1000(fuxiuyin) gid=1000(fuxiuyin) groups=1000(fuxiuyin),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),108(lpadmin),124(sambashare)
My test_app.py is:
from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName("test_app")
sc = SparkContext(conf=conf)
data = sc.parallelize([1, 2, 3])
data = data.map(lambda x: x + 1)
print(data.collect())
I don't know how to fix it.
Thanks.
The driver has to collect all the data from the worker nodes before printing, so I think the error is due to
print(data.collect())
Use
for x in data.collect():
    print(x)
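(As an aside: if the RDD were large, collect() would materialize the whole dataset in the driver's memory; a bounded take() is a safer way to peek at results. A minimal sketch over the same RDD as above:)
for x in data.take(10):  # ships at most 10 elements to the driver
    print(x)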
and use spark-submit as:
spark-submit --master yarn --deploy-mode cluster test_app.py
instead of:
spark-submit --master yarn test_app.py
(Note that in cluster mode the driver runs inside the YARN ApplicationMaster, so the output of print() ends up in the YARN container logs rather than on your console.)
Try this command: spark-submit --master yarn-client test_app.py (on Spark 2.x the yarn-client master string is deprecated; the equivalent is spark-submit --master yarn --deploy-mode client test_app.py).
When I run val lines = sc.textFile("hdfs:///input") in yarn-client mode, a 'Cannot call methods on a stopped SparkContext' error occurs. I searched all day long for two days, but I can't find the cause. "hdfs:///input" is correct, because when I executed it in standalone mode, it worked well.
Could you give me any idea about that?
I'm using Spark 1.5.2 and Hadoop 2.7.2.
starting org.apache.spark.deploy.master.Master, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
192.168.111.203: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out
192.168.111.202: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
[root@master spark-1.5.2-bin-hadoop2.6]# bin/spark-shell --master yarn-client
16/03/19 05:59:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 05:59:12 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:12 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:13 INFO spark.HttpServer: Starting HTTP Server
16/03/19 05:59:13 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:13 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:46780
16/03/19 05:59:13 INFO util.Utils: Successfully started service 'HTTP class server' on port 46780.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_73)
Type in expressions to have them evaluated.
Type :help for more information.
16/03/19 05:59:17 INFO spark.SparkContext: Running Spark version 1.5.2
16/03/19 05:59:17 WARN spark.SparkConf:
SPARK_JAVA_OPTS was detected (set to '-Dspark.driver.port=53411').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with conf/spark-defaults.conf to set defaults for an application
- ./spark-submit with --driver-java-options to set -X options for a driver
- spark.executor.extraJavaOptions to set -X options for executors
- SPARK_DAEMON_JAVA_OPTS to set java options for standalone daemons (master or worker)
16/03/19 05:59:17 WARN spark.SparkConf: Setting 'spark.executor.extraJavaOptions' to '-Dspark.driver.port=53411' as a work-around.
16/03/19 05:59:17 WARN spark.SparkConf: Setting 'spark.driver.extraJavaOptions' to '-Dspark.driver.port=53411' as a work-around.
16/03/19 05:59:17 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:17 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:17 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:18 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/03/19 05:59:18 INFO Remoting: Starting remoting
16/03/19 05:59:18 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.111.201:53411]
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'sparkDriver' on port 53411.
16/03/19 05:59:18 INFO spark.SparkEnv: Registering MapOutputTracker
16/03/19 05:59:18 INFO spark.SparkEnv: Registering BlockManagerMaster
16/03/19 05:59:18 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-f70b1bb6-288b-4894-bb49-22d1fc3d8d89
16/03/19 05:59:18 INFO storage.MemoryStore: MemoryStore started with capacity 534.5 MB
16/03/19 05:59:18 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-58591b6b-5b19-4bc0-a993-0b846de5ef6f/httpd-fe0c46a2-1d87-4bc7-8b4f-adfc79cb762a
16/03/19 05:59:18 INFO spark.HttpServer: Starting HTTP Server
16/03/19 05:59:18 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:18 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:40258
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'HTTP file server' on port 40258.
16/03/19 05:59:18 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/03/19 05:59:18 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 05:59:18 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/03/19 05:59:18 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/03/19 05:59:18 INFO ui.SparkUI: Started SparkUI at http://192.168.111.201:4040
16/03/19 05:59:19 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/03/19 05:59:19 INFO client.RMProxy: Connecting to ResourceManager at /192.168.111.201:8032
16/03/19 05:59:19 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
16/03/19 05:59:19 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/03/19 05:59:19 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/03/19 05:59:19 INFO yarn.Client: Setting up container launch context for our AM
16/03/19 05:59:19 INFO yarn.Client: Setting up the launch environment for our AM container
16/03/19 05:59:19 INFO yarn.Client: Preparing resources for our AM container
16/03/19 05:59:21 INFO yarn.Client: Uploading resource file:/opt/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar -> hdfs://192.168.111.201:9000/user/root/.sparkStaging/application_1458334003417_0002/spark-assembly-1.5.2-hadoop2.6.0.jar
16/03/19 05:59:25 INFO yarn.Client: Uploading resource file:/tmp/spark-58591b6b-5b19-4bc0-a993-0b846de5ef6f/__spark_conf__2052137095112870542.zip -> hdfs://192.168.111.201:9000/user/root/.sparkStaging/application_1458334003417_0002/__spark_conf__2052137095112870542.zip
16/03/19 05:59:25 INFO spark.SecurityManager: Changing view acls to: root
16/03/19 05:59:25 INFO spark.SecurityManager: Changing modify acls to: root
16/03/19 05:59:25 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/19 05:59:25 INFO yarn.Client: Submitting application 2 to ResourceManager
16/03/19 05:59:25 INFO impl.YarnClientImpl: Submitted application application_1458334003417_0002
16/03/19 05:59:26 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:26 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1458334765746
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1458334003417_0002/
user: root
16/03/19 05:59:27 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:28 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:29 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:30 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:31 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:32 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:33 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:34 INFO yarn.Client: Application report for application_1458334003417_0002 (state: ACCEPTED)
16/03/19 05:59:35 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM@192.168.111.203:46505/user/YarnAM#149895142])
16/03/19 05:59:35 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1458334003417_0002), /proxy/application_1458334003417_0002
16/03/19 05:59:35 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/19 05:59:35 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:46505
16/03/19 05:59:35 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@192.168.111.203:46505] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:35 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:46505
16/03/19 05:59:35 INFO yarn.Client: Application report for application_1458334003417_0002 (state: RUNNING)
16/03/19 05:59:35 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.111.203
ApplicationMaster RPC port: 0
queue: default
start time: 1458334765746
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1458334003417_0002/
user: root
16/03/19 05:59:35 INFO cluster.YarnClientSchedulerBackend: Application application_1458334003417_0002 has started running.
16/03/19 05:59:36 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42938.
16/03/19 05:59:36 INFO netty.NettyBlockTransferService: Server created on 42938
16/03/19 05:59:36 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/03/19 05:59:36 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.111.201:42938 with 534.5 MB RAM, BlockManagerId(driver, 192.168.111.201, 42938)
16/03/19 05:59:36 INFO storage.BlockManagerMaster: Registered BlockManager
16/03/19 05:59:40 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM@192.168.111.203:34633/user/YarnAM#-40449267])
16/03/19 05:59:40 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1458334003417_0002), /proxy/application_1458334003417_0002
16/03/19 05:59:40 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@192.168.111.203:34633] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/03/19 05:59:41 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/03/19 05:59:41 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.111.201:4040
16/03/19 05:59:41 INFO scheduler.DAGScheduler: Stopping DAGScheduler
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
16/03/19 05:59:41 INFO cluster.YarnClientSchedulerBackend: Stopped
16/03/19 05:59:42 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/19 05:59:42 INFO storage.MemoryStore: MemoryStore cleared
16/03/19 05:59:42 INFO storage.BlockManager: BlockManager stopped
16/03/19 05:59:42 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/03/19 05:59:42 INFO spark.SparkContext: Successfully stopped SparkContext
16/03/19 05:59:42 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/03/19 05:59:49 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/03/19 05:59:49 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.
16/03/19 05:59:49 INFO hive.HiveContext: Initializing execution hive, version 1.2.1
16/03/19 05:59:49 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0
16/03/19 05:59:49 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/03/19 05:59:50 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/03/19 05:59:50 INFO metastore.ObjectStore: ObjectStore, initialize called
16/03/19 05:59:50 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/03/19 05:59:50 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/03/19 05:59:50 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 05:59:51 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 05:59:53 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/03/19 05:59:54 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:54 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:56 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/03/19 05:59:56 INFO metastore.ObjectStore: Initialized ObjectStore
16/03/19 05:59:57 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/03/19 05:59:57 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/03/19 05:59:57 INFO metastore.HiveMetaStore: Added admin role in metastore
16/03/19 05:59:57 INFO metastore.HiveMetaStore: Added public role in metastore
16/03/19 05:59:58 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/03/19 05:59:58 INFO metastore.HiveMetaStore: 0: get_all_databases
16/03/19 05:59:58 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/03/19 05:59:58 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/19 05:59:58 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/19 05:59:58 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/root
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/e16dc45f-de41-4e69-9f73-c976cc3358c9_resources
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/e16dc45f-de41-4e69-9f73-c976cc3358c9
16/03/19 05:59:58 INFO session.SessionState: Created local directory: /tmp/root/e16dc45f-de41-4e69-9f73-c976cc3358c9
16/03/19 05:59:58 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/e16dc45f-de41-4e69-9f73-c976cc3358c9/_tmp_space.db
16/03/19 05:59:58 INFO hive.HiveContext: default warehouse location is /user/hive/warehouse
16/03/19 05:59:58 INFO hive.HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/03/19 05:59:58 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0
16/03/19 05:59:59 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/03/19 06:00:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 06:00:00 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/03/19 06:00:00 INFO metastore.ObjectStore: ObjectStore, initialize called
16/03/19 06:00:00 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/03/19 06:00:00 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/03/19 06:00:00 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 06:00:00 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/03/19 06:00:01 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/03/19 06:00:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:04 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/03/19 06:00:04 INFO metastore.ObjectStore: Initialized ObjectStore
16/03/19 06:00:04 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/03/19 06:00:05 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/03/19 06:00:05 INFO metastore.HiveMetaStore: Added admin role in metastore
16/03/19 06:00:05 INFO metastore.HiveMetaStore: Added public role in metastore
16/03/19 06:00:05 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/03/19 06:00:05 INFO metastore.HiveMetaStore: 0: get_all_databases
16/03/19 06:00:05 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/03/19 06:00:06 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/19 06:00:06 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/19 06:00:06 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/03/19 06:00:06 INFO session.SessionState: Created local directory: /tmp/b046e212-ccbd-4415-aec3-5b207f147fda_resources
16/03/19 06:00:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/b046e212-ccbd-4415-aec3-5b207f147fda
16/03/19 06:00:06 INFO session.SessionState: Created local directory: /tmp/root/b046e212-ccbd-4415-aec3-5b207f147fda
16/03/19 06:00:06 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/b046e212-ccbd-4415-aec3-5b207f147fda/_tmp_space.db
16/03/19 06:00:06 INFO repl.SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.
scala> val lines = sc.textFile("hdfs:///input")
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)
at org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:2063)
at org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:2076)
at org.apache.spark.SparkContext.textFile$default$2(SparkContext.scala:825)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:21)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:28)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
at $iwC$$iwC$$iwC.<init>(<console>:34)
at $iwC$$iwC.<init>(<console>:36)
at $iwC.<init>(<console>:38)
at <init>(<console>:40)
at .<init>(<console>:44)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I encountered this in my Spark Structured Streaming application when I forgot to include the following:
spark.streams.awaitAnyTermination()
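For reference, a minimal PySpark Structured Streaming sketch showing where that call belongs (the app name and the built-in "rate" source here are illustrative, not from any code above). Without the last line the driver's main thread returns, the SparkContext shuts down, and the YARN application finishes while the query is still running:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("streaming_app").getOrCreate()

# Built-in test source that emits one row per second.
stream = spark.readStream.format("rate").option("rowsPerSecond", 1).load()

# Print each micro-batch to stdout.
query = stream.writeStream.format("console").start()

# Block the driver until a streaming query terminates; forgetting this
# line is what made the application exit immediately.
spark.streams.awaitAnyTermination()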
Your YARN application exits immediately after it starts:
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster has disassociated: 192.168.111.203:34633
16/03/19 05:59:41 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@192.168.111.203:34633] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/03/19 05:59:41 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
Then the SparkContext is closed, so any action on that context will throw the exception you see.
Check the "Application Master" logs (visible through YARN's UI) to see the cause of the failure. It could be a memory configuration issue, a network issue (e.g. host unreachable), or something else; the log on the driver side (which is what you pasted) won't tell you which one it is.
Previously I ran the Spark job from the command line with the spark-submit command:
sudo -su root spark-submit --executor-memory 512m --num-executors 50 --class com.mycompany.project.SparkHdfsToHBase --master yarn parser-1.0-jar-with-dependencies.jar 0 hdfs://company.com:8020/tmp/testfiles/html_files/* 127.0.0.1 --verbose
And everything works like a charm.
Now I am trying to run the Spark job through an Oozie workflow. I have created a workflow and added a shell action with a Python script.
The Python script looks like:
#!/usr/bin/env python
import subprocess

subprocess.call(["spark-submit",
                 "--class", "com.mycompany.project.SparkHdfsToHBase",
                 "--executor-memory", "512m", "--driver-memory", "512m",
                 "--master", "yarn",
                 "file:///parser-1.0-jar-with-dependencies.jar",
                 "0", "hdfs://company.com:8020/tmp/testfiles/html_files/*",
                 "127.0.0.1", "--verbose"])
I haven't added any other parameters to the job, only the path to the file. When I submit the job, it gets stuck at 5% in the Job Browser.
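(One debugging idea, a sketch of the same script with nothing about the job itself changed: stream spark-submit's own output into the shell action's log and report its exit code, so the driver-side messages show up next to the syslog below.)
#!/usr/bin/env python
import subprocess

cmd = ["spark-submit",
       "--class", "com.mycompany.project.SparkHdfsToHBase",
       "--executor-memory", "512m", "--driver-memory", "512m",
       "--master", "yarn",
       "file:///parser-1.0-jar-with-dependencies.jar",
       "0", "hdfs://company.com:8020/tmp/testfiles/html_files/*",
       "127.0.0.1", "--verbose"]

# Merge stderr into stdout and forward everything to this action's log.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, universal_newlines=True)
for line in proc.stdout:
    print(line.rstrip())
print("spark-submit exited with code %d" % proc.wait())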
I haven't gotten any useful information from the logs:
Syslog:
2015-03-16 11:46:51,152 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-03-16 11:46:51,152 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2015-03-16 11:46:51,154 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2015-03-16 11:46:51,157 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2015-03-16 11:46:51,172 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-03-16 11:46:51,354 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2015-03-16 11:46:51,354 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier#46ea3050)
2015-03-16 11:46:51,393 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: RM_DELEGATION_TOKEN, Service: 172.24.4.231:8032, Ident: (owner=admin, renewer=oozie mr token, realUser=oozie, issueDate=1426499201955, maxDate=1427104001955, sequenceNumber=6, masterKeyId=2)
2015-03-16 11:46:51,631 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert; Ignoring.
2015-03-16 11:46:51,633 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-03-16 11:46:51,637 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2015-03-16 11:46:51,640 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2015-03-16 11:46:51,644 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2015-03-16 11:46:51,661 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-03-16 11:46:52,495 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-03-16 11:46:52,803 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2015-03-16 11:46:52,805 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
2015-03-16 11:46:52,868 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2015-03-16 11:46:52,869 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2015-03-16 11:46:52,870 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2015-03-16 11:46:52,871 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2015-03-16 11:46:52,872 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2015-03-16 11:46:52,873 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2015-03-16 11:46:52,874 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2015-03-16 11:46:52,875 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2015-03-16 11:46:52,967 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2015-03-16 11:46:53,241 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-03-16 11:46:53,303 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-03-16 11:46:53,303 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2015-03-16 11:46:53,320 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1426498753001_0001 to jobTokenSecretManager
2015-03-16 11:46:53,447 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1426498753001_0001 because: not enabled;
2015-03-16 11:46:53,473 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1426498753001_0001 = 0. Number of splits = 1
2015-03-16 11:46:53,473 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1426498753001_0001 = 0
2015-03-16 11:46:53,473 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1426498753001_0001Job Transitioned from NEW to INITED
2015-03-16 11:46:53,474 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1426498753001_0001.
2015-03-16 11:46:53,532 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-03-16 11:46:53,542 INFO [Socket Reader #1 for port 44690] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 44690
2015-03-16 11:46:53,564 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2015-03-16 11:46:53,565 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-03-16 11:46:53,566 INFO [IPC Server listener on 44690] org.apache.hadoop.ipc.Server: IPC Server listener on 44690: starting
2015-03-16 11:46:53,566 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at company.com/172.24.4.231:44690
2015-03-16 11:46:53,641 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-03-16 11:46:53,646 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
2015-03-16 11:46:53,658 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-03-16 11:46:53,663 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
2015-03-16 11:46:53,663 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
2015-03-16 11:46:53,667 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
2015-03-16 11:46:53,667 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2015-03-16 11:46:53,678 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 57328
2015-03-16 11:46:53,678 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
2015-03-16 11:46:53,706 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/jars/hadoop-yarn-common-2.5.0-cdh5.3.1.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_57328_mapreduce____37uqf6/webapp
2015-03-16 11:46:54,346 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:57328
2015-03-16 11:46:54,346 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 57328
2015-03-16 11:46:55,005 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2015-03-16 11:46:55,012 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-03-16 11:46:55,012 INFO [Socket Reader #1 for port 60762] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 60762
2015-03-16 11:46:55,021 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-03-16 11:46:55,023 INFO [IPC Server listener on 60762] org.apache.hadoop.ipc.Server: IPC Server listener on 60762: starting
2015-03-16 11:46:55,111 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2015-03-16 11:46:55,111 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2015-03-16 11:46:55,111 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2015-03-16 11:46:55,369 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert; Ignoring.
2015-03-16 11:46:55,369 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-03-16 11:46:55,369 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2015-03-16 11:46:55,370 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2015-03-16 11:46:55,375 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2015-03-16 11:46:55,383 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-03-16 11:46:55,388 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at company.com/172.24.4.231:8030
2015-03-16 11:46:55,574 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: <memory:8192, vCores:2>
2015-03-16 11:46:55,574 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.admin
2015-03-16 11:46:55,582 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2015-03-16 11:46:55,584 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
2015-03-16 11:46:55,599 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1426498753001_0001Job Transitioned from INITED to SETUP
2015-03-16 11:46:55,611 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2015-03-16 11:46:55,633 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1426498753001_0001Job Transitioned from SETUP to RUNNING
2015-03-16 11:46:55,714 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1426498753001_0001_m_000000 Task Transitioned from NEW to SCHEDULED
2015-03-16 11:46:55,723 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1426498753001_0001_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2015-03-16 11:46:55,738 INFO [Thread-50] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1>
2015-03-16 11:46:55,797 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1426498753001_0001, File: hdfs://company.com:8020/user/admin/.staging/job_1426498753001_0001/job_1426498753001_0001_1.jhist
2015-03-16 11:46:56,584 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2015-03-16 11:46:56,658 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1426498753001_0001: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:7168, vCores:0> knownNMs=1
Stderr:
Mar 16, 2015 11:47:04 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Mar 16, 2015 11:47:04 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Mar 16, 2015 11:47:04 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Mar 16, 2015 11:47:04 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Mar 16, 2015 11:47:04 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Mar 16, 2015 11:47:05 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Mar 16, 2015 11:47:05 AM com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
WARNING: You are attempting to use a deprecated API (specifically, attempting to #Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
Mar 16, 2015 11:47:05 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
I'll be happy to get any help. Thanks.
I am working on Hortonworks Hive.
I have seen the same type of errors before, but the underlying MapReduce error seems to be different in this case: the application fails with exitCode 1.
In Hive, the statement
Select * from SomeTable;
works fine, but
Select colName from SomeTable;
does not.
Application error log
2014-03-17 12:49:15,557 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1395039411618_0001 State change from ACCEPTED to FAILED
2014-03-17 12:49:15,558 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application appattempt_1395039411618_0001_000002 is done. finalState=FAILED
2014-03-17 12:49:15,559 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1395039411618_0001 requests cleared
2014-03-17 12:49:15,559 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1395039411618_0001 user: asande queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2014-03-17 12:49:15,559 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1395039411618_0001 user: asande leaf-queue of parent: root #applications: 0
2014-03-17 12:49:15,559 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=asande OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1395039411618_0001 failed 2 times due to AM Container for appattempt_1395039411618_0001_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
Here's the Hive log (though it seems there's nothing wrong in it):
2014-03-17 10:45:37,322 INFO server.HiveServer2 (HiveStringUtils.java:startupShutdownMessage(604)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting HiveServer2
STARTUP_MSG: host = ASANDE1/16.155.82.203
STARTUP_MSG: args = [-hiveconf, hive.hadoop.classpath=c:\hdp\hive-0.12.0.2.0.6.0-0009\lib\*.............................................., -hiveconf, hive.querylog.location=c:\hadoop\logs\hive\history, -hiveconf, hive.log.dir=c:\hadoop\logs\hive]
STARTUP_MSG: version = 0.12.0.2.0.6.0-0009
STARTUP_MSG: classpath = c:\hdp\hadoop-2.2.0.2.0.6.0-0009\etc\hadoop;c:\hdp\hadoop-;;
STARTUP_MSG: build = git://sijenkins-vm3/cygdrive/d/w/bw/project/hive-monarch -r a7f54db5645b645500778b92e7fad8fab7738080; compiled by 'jenkins' on Fri Dec 20 18:29:58 PST 2013
************************************************************/
2014-03-17 10:45:39,622 INFO service.CompositeService (SessionManager.java:init(60)) - HiveServer2: Async execution thread pool size: 100
2014-03-17 10:45:39,622 INFO service.CompositeService (SessionManager.java:init(62)) - HiveServer2: Async execution wait queue size: 100
2014-03-17 10:45:39,623 INFO service.CompositeService (SessionManager.java:init(64)) - HiveServer2: Async execution thread keepalive time: 10
2014-03-17 10:45:39,628 INFO service.AbstractService (AbstractService.java:init(89)) - Service:OperationManager is inited.
2014-03-17 10:45:39,628 INFO service.AbstractService (AbstractService.java:init(89)) - Service:SessionManager is inited.
2014-03-17 10:45:39,628 INFO service.AbstractService (AbstractService.java:init(89)) - Service:CLIService is inited.
2014-03-17 10:45:39,628 INFO service.AbstractService (AbstractService.java:init(89)) - Service:ThriftBinaryCLIService is inited.
2014-03-17 10:45:39,628 INFO service.AbstractService (AbstractService.java:init(89)) - Service:HiveServer2 is inited.
2014-03-17 10:45:39,628 INFO service.AbstractService (AbstractService.java:start(104)) - Service:OperationManager is started.
2014-03-17 10:45:39,628 INFO service.AbstractService (AbstractService.java:start(104)) - Service:SessionManager is started.
2014-03-17 10:45:39,628 INFO service.AbstractService (AbstractService.java:start(104)) - Service:CLIService is started.
2014-03-17 10:45:39,925 INFO hive.metastore (HiveMetaStoreClient.java:open(244)) - Trying to connect to metastore with URI thrift://ASANDE1:9083
2014-03-17 10:46:02,239 WARN hive.metastore (HiveMetaStoreClient.java:open(307)) - set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:2822)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:2808)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 14 more
2014-03-17 10:46:02,280 INFO hive.metastore (HiveMetaStoreClient.java:open(322)) - Waiting 1 seconds before next connection attempt.
2014-03-17 10:46:03,280 INFO hive.metastore (HiveMetaStoreClient.java:open(332)) - Connected to metastore.
2014-03-17 10:46:08,455 ERROR hive.log (MetaStoreUtils.java:logAndThrowMetaException(960)) - Got exception: org.apache.thrift.TApplicationException get_databases failed: out of sequence response
org.apache.thrift.TApplicationException: get_databases failed: out of sequence response
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:500)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:487)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:722)
at org.apache.hive.service.cli.CLIService.start(CLIService.java:83)
at org.apache.hive.service.CompositeService.start(CompositeService.java:70)
at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:73)
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:103)
2014-03-17 10:46:08,455 ERROR hive.log (MetaStoreUtils.java:logAndThrowMetaException(961)) - Converting exception to MetaException
2014-03-17 10:46:08,621 ERROR service.CompositeService (CompositeService.java:start(74)) - Error starting services HiveServer2
org.apache.hive.service.ServiceException: Unable to connect to MetaStore!
at org.apache.hive.service.cli.CLIService.start(CLIService.java:85)
..........................
Caused by: MetaException(message:Got exception: org.apache.thrift.TApplicationException get_databases failed: out of sequence response)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:962)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:724)
at org.apache.hive.service.cli.CLIService.start(CLIService.java:83)
... 3 more
2014-03-17 10:46:08,627 INFO service.AbstractService (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
2014-03-17 10:46:08,627 INFO service.AbstractService (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
2014-03-17 10:46:08,627 INFO service.AbstractService (AbstractService.java:stop(125)) - Service:CLIService is stopped.
2014-03-17 10:46:08,627 FATAL server.HiveServer2 (HiveServer2.java:main(105)) - Error starting HiveServer2
org.apache.hive.service.ServiceException: Failed to Start HiveServer2
at org.apache.hive.service.CompositeService.start(CompositeService.java:80)
at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:73)
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:103)
Caused by: org.apache.hive.service.ServiceException: Unable to connect to MetaStore!
at org.apache.hive.service.cli.CLIService.start(CLIService.java:85)
at org.apache.hive.service.CompositeService.start(CompositeService.java:70)
... 2 more
Caused by: MetaException(message:Got exception: org.apache.thrift.TApplicationException get_databases failed: out of sequence response)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:962)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:724)
at org.apache.hive.service.cli.CLIService.start(CLIService.java:83)
... 3 more
2014-03-17 10:46:08,791 INFO server.HiveServer2 (HiveStringUtils.java:run(622)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down HiveServer2 at ASANDE1/16.155.82.203
************************************************************/
2014-03-17 10:46:22,805 INFO server.HiveServer2 (HiveStringUtils.java:startupShutdownMessage(604)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting HiveServer2
STARTUP_MSG: host = ASANDE1/16.155.82.203
STARTUP_MSG: args = [-hiveconf, hive.hadoop.classpath=c:\hdp\hive-0.12.0.2.0.6.0-0009\lib\*, -hiveconf, hive.querylog.location=c:\hadoop\logs\hive\history, -hiveconf, hive.log.dir=c:\hadoop\logs\hive]
STARTUP_MSG: version = 0.12.0.2.0.6.0-0009
STARTUP_MSG: classpath = c:\hdp\hadoop-2.2.0.2.0.6.0-0009\etc\hadoop;c:\hdp\hadoop-2.2.0.2.0.6.0-0009\share\hadoop\common\lib\activation-1.1.jar;c:\hdp\hadoop-2.2.0.2.0.6.0-
...............................................................
STARTUP_MSG: build = git://sijenkins-vm3/cygdrive/d/w/bw/project/hive-monarch -r a7f54db5645b645500778b92e7fad8fab7738080; compiled by 'jenkins' on Fri Dec 20 18:29:58 PST 2013
************************************************************/
2014-03-17 10:46:23,677 INFO service.CompositeService (SessionManager.java:init(60)) - HiveServer2: Async execution thread pool size: 100
2014-03-17 10:46:23,677 INFO service.CompositeService (SessionManager.java:init(62)) - HiveServer2: Async execution wait queue size: 100
2014-03-17 10:46:23,678 INFO service.CompositeService (SessionManager.java:init(64)) - HiveServer2: Async execution thread keepalive time: 10
2014-03-17 10:46:23,682 INFO service.AbstractService (AbstractService.java:init(89)) - Service:OperationManager is inited.
2014-03-17 10:46:23,682 INFO service.AbstractService (AbstractService.java:init(89)) - Service:SessionManager is inited.
2014-03-17 10:46:23,682 INFO service.AbstractService (AbstractService.java:init(89)) - Service:CLIService is inited.
2014-03-17 10:46:23,683 INFO service.AbstractService (AbstractService.java:init(89)) - Service:ThriftBinaryCLIService is inited.
2014-03-17 10:46:23,683 INFO service.AbstractService (AbstractService.java:init(89)) - Service:HiveServer2 is inited.
2014-03-17 10:46:23,683 INFO service.AbstractService (AbstractService.java:start(104)) - Service:OperationManager is started.
2014-03-17 10:46:23,683 INFO service.AbstractService (AbstractService.java:start(104)) - Service:SessionManager is started.
2014-03-17 10:46:23,683 INFO service.AbstractService (AbstractService.java:start(104)) - Service:CLIService is started.
2014-03-17 10:46:23,694 INFO hive.metastore (HiveMetaStoreClient.java:open(244)) - Trying to connect to metastore with URI thrift://ASANDE1:9083
2014-03-17 10:46:24,093 INFO hive.metastore (HiveMetaStoreClient.java:open(322)) - Waiting 1 seconds before next connection attempt.
2014-03-17 10:46:25,093 INFO hive.metastore (HiveMetaStoreClient.java:open(332)) - Connected to metastore.
2014-03-17 10:46:25,118 INFO service.AbstractService (AbstractService.java:start(104)) - Service:ThriftBinaryCLIService is started.
2014-03-17 10:46:25,122 INFO service.AbstractService (AbstractService.java:start(104)) - Service:HiveServer2 is started.
2014-03-17 10:46:25,351 INFO thrift.ThriftCLIService (ThriftBinaryCLIService.java:run(76)) - ThriftBinaryCLIService listening on 0.0.0.0/0.0.0.0:10001
2014-03-17 12:27:04,409 INFO server.HiveServer2 (HiveStringUtils.java:startupShutdownMessage(604)) - STARTUP_MSG:
/************************************************************
The query that worked for you does not launch a MapReduce job; it looks like there is a problem launching MapReduce jobs. Can you check the Hive logs (default location /tmp/hive/hive.log, assuming you are running hiveserver2 as user hive) to see if they contain the full error message?
One common error you might find in the logs is a permission-denied error. Often the ODBC driver logs in as user hadoop, which has no write access, causing MapReduce jobs to fail to start. If you change this user to hdfs, your problem might be solved.
The difference between the Select * from SomeTable; and Select colName from SomeTable; queries you ran is that the latter requires information from the Hive Metastore service. Since you are asking for a specific column, Hive must first check the schema definition, which is stored in the Hive Metastore.
Your issue is that you cannot connect to the Hive Metastore, as indicated by the following message in your log file:
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
Try restarting your Hive Metastore service.
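As a rough sketch on a plain Linux install (on a Hortonworks setup you would normally restart the service through the management console instead, and the process names below are assumptions), the metastore can be stopped and started again by hand; it listens on port 9083 by default, matching the thrift://ASANDE1:9083 URI in your log:
# find the running metastore process and stop it
ps aux | grep HiveMetaStore
kill <metastore_pid>
# start it again in the background
nohup hive --service metastore &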
HMaster is aborting after starting ./start-hbase.sh in HBase 0.96.0 with Hadoop 2.2.0.
I tried HBase 0.94.16 and HBase 0.98 with the same result: HMaster aborts as soon as it starts. I even tried replacing the jars in the HBase lib directory, both manually and via Maven, but the issue is unresolved. Is there any other solution?
Below is the corresponding hbase-hadoop-master-hadoop-master.log...
2014-02-24 10:11:27,078 INFO [Replication.RpcServer.handler=2,port=60000] ipc.RpcServer: Replication.RpcServer.handler=2,port=60000: starting
2014-02-24 10:11:27,565 INFO [RpcServer.handler=23,port=60000] ipc.RpcServer: RpcServer.handler=23,port=60000: starting
2014-02-24 10:11:27,970 INFO [master:hadoop-master:60000] mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2014-02-24 10:11:28,172 INFO [master:hadoop-master:60000] http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2014-02-24 10:11:28,177 INFO [master:hadoop-master:60000] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context master
2014-02-24 10:11:28,177 INFO [master:hadoop-master:60000] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2014-02-24 10:11:28,191 INFO [master:hadoop-master:60000] http.HttpServer: Jetty bound to port 60010
2014-02-24 10:11:28,191 INFO [master:hadoop-master:60000] mortbay.log: jetty-6.1.26
2014-02-24 10:11:29,227 INFO [master:hadoop-master:60000] mortbay.log: Started SelectChannelConnector#0.0.0.0:60010
2014-02-24 10:11:29,623 INFO [master:hadoop-master:60000] master.ActiveMasterManager: Registered Active Master=hadoop-master.payoda.com,60000,1393236677609
2014-02-24 10:11:29,629 INFO [master:hadoop-master:60000] Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
2014-02-24 10:11:29,851 DEBUG [main-EventThread] master.ActiveMasterManager: A master is now available
2014-02-24 10:11:30,537 INFO [master:hadoop-master:60000] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2014-02-24 10:11:30,800 DEBUG [master:hadoop-master:60000] util.FSTableDescriptors: Current tableInfoPath = hdfs://hadoop-master:9000/hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001
2014-02-24 10:11:30,821 DEBUG [master:hadoop-master:60000] util.FSTableDescriptors: TableInfo already exists.. Skipping creation
2014-02-24 10:11:30,944 INFO [master:hadoop-master:60000] fs.HFileSystem: Added intercepting call to namenode#getBlockLocations so can do block reordering using class class org.apache.hadoop.hbase.fs.HFileSystem$ReorderWALBlocks
2014-02-24 10:11:30,950 INFO [master:hadoop-master:60000] master.SplitLogManager: Timeout=120000, unassigned timeout=180000, distributedLogReplay=false
2014-02-24 10:11:30,956 INFO [master:hadoop-master:60000] master.SplitLogManager: Found 0 orphan tasks and 0 rescan nodes
2014-02-24 10:11:31,000 INFO [master:hadoop-master:60000] zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.14.35:2181 sessionTimeout=90000 watcher=hconnection-0x4a867fad
2014-02-24 10:11:31,012 INFO [master:hadoop-master:60000-SendThread(hadoop-master.payoda.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server hadoop-master.payoda.com/192.168.14.35:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2014-02-24 10:11:31,617 INFO [master:hadoop-master:60000-SendThread(hadoop-master.payoda.com:2181)] zookeeper.ClientCnxn: Socket connection established to hadoop-master.payoda.com/192.168.14.35:2181, initiating session
2014-02-24 10:11:31,617 INFO [master:hadoop-master:60000] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x4a867fad connecting to ZooKeeper ensemble=192.168.14.35:2181
2014-02-24 10:11:31,620 INFO [master:hadoop-master:60000-SendThread(hadoop-master.payoda.com:2181)] zookeeper.ClientCnxn: Session establishment complete on server hadoop-master.payoda.com/192.168.14.35:2181, sessionid = 0x1446360aa4a0001, negotiated timeout = 90000
2014-02-24 10:11:31,640 DEBUG [master:hadoop-master:60000] catalog.CatalogTracker: Starting catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker#3eaa3e5b
2014-02-24 10:11:31,684 FATAL [master:hadoop-master:60000] master.HMaster: Unhandled exception. Starting shutdown.
java.lang.IllegalArgumentException: .META. no longer exists. The table has been renamed to hbase:meta
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:292)
at org.apache.hadoop.hbase.zookeeper.ZKTable.populateTableStates(ZKTable.java:82)
at org.apache.hadoop.hbase.zookeeper.ZKTable.<init>(ZKTable.java:69)
at org.apache.hadoop.hbase.master.AssignmentManager.<init>(AssignmentManager.java:281)
at org.apache.hadoop.hbase.master.HMaster.initializeZKBasedSystemTrackers(HMaster.java:677)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:809)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:603)
at java.lang.Thread.run(Thread.java:662)
2014-02-24 10:11:31,684 INFO [master:hadoop-master:60000] master.HMaster: Aborting
2014-02-24 10:11:31,711 DEBUG [master:hadoop-master:60000] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker#3eaa3e5b
2014-02-24 10:11:31,712 DEBUG [master:hadoop-master:60000] master.HMaster: Stopping service threads
2014-02-24 10:11:31,712 INFO [master:hadoop-master:60000] ipc.RpcServer: Stopping server on 60000
2014-02-24 10:11:31,712 INFO [RpcServer.handler=15,port=60000] ipc.RpcServer: RpcServer.handler=15,port=60000: exiting
2014-02-24 10:11:31,712 INFO [RpcServer.handler=23,port=60000] ipc.RpcServer: RpcServer.handler=23,port=60000: exiting
2014-02-24 10:11:32,129 INFO [master:hadoop-master:60000] master.HMaster: Stopping infoServer
2014-02-24 10:11:32,138 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2014-02-24 10:11:32,138 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2014-02-24 10:11:32,304 INFO [hadoop-master.payoda.com,60000,1393236677609.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: hadoop-master.payoda.com,60000,1393236677609.splitLogManagerTimeoutMonitor exiting
2014-02-24 10:11:32,304 INFO [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2014-02-24 10:11:32,304 INFO [Replication.RpcServer.handler=2,port=60000] ipc.RpcServer: Replication.RpcServer.handler=2,port=60000: exiting
2014-02-24 10:11:32,304 INFO [Replication.RpcServer.handler=1,port=60000] ipc.RpcServer: Replication.RpcServer.handler=1,port=60000: exiting
2014-02-24 10:11:32,304 INFO [Replication.RpcServer.handler=0,port=60000] ipc.RpcServer: Replication.RpcServer.handler=0,port=60000: exiting
2014-02-24 10:11:32,304 INFO [RpcServer.handler=29,port=60000] ipc.RpcServer: RpcServer.handler=29,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=28,port=60000] ipc.RpcServer: RpcServer.handler=28,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=27,port=60000] ipc.RpcServer: RpcServer.handler=27,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=26,port=60000] ipc.RpcServer: RpcServer.handler=26,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=25,port=60000] ipc.RpcServer: RpcServer.handler=25,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=24,port=60000] ipc.RpcServer: RpcServer.handler=24,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=22,port=60000] ipc.RpcServer: RpcServer.handler=22,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=21,port=60000] ipc.RpcServer: RpcServer.handler=21,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=20,port=60000] ipc.RpcServer: RpcServer.handler=20,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=19,port=60000] ipc.RpcServer: RpcServer.handler=19,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=18,port=60000] ipc.RpcServer: RpcServer.handler=18,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=17,port=60000] ipc.RpcServer: RpcServer.handler=17,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=16,port=60000] ipc.RpcServer: RpcServer.handler=16,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=14,port=60000] ipc.RpcServer: RpcServer.handler=14,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=13,port=60000] ipc.RpcServer: RpcServer.handler=13,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=12,port=60000] ipc.RpcServer: RpcServer.handler=12,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=11,port=60000] ipc.RpcServer: RpcServer.handler=11,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=10,port=60000] ipc.RpcServer: RpcServer.handler=10,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=9,port=60000] ipc.RpcServer: RpcServer.handler=9,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=8,port=60000] ipc.RpcServer: RpcServer.handler=8,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=7,port=60000] ipc.RpcServer: RpcServer.handler=7,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=6,port=60000] ipc.RpcServer: RpcServer.handler=6,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=5,port=60000] ipc.RpcServer: RpcServer.handler=5,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=4,port=60000] ipc.RpcServer: RpcServer.handler=4,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=3,port=60000] ipc.RpcServer: RpcServer.handler=3,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=2,port=60000] ipc.RpcServer: RpcServer.handler=2,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=1,port=60000] ipc.RpcServer: RpcServer.handler=1,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=0,port=60000] ipc.RpcServer: RpcServer.handler=0,port=60000: exiting
2014-02-24 10:11:32,930 INFO [master:hadoop-master:60000] mortbay.log: Stopped SelectChannelConnector#0.0.0.0:60010
2014-02-24 10:11:32,945 INFO [master:hadoop-master:60000] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1446360aa4a0001
2014-02-24 10:11:32,948 INFO [master:hadoop-master:60000] zookeeper.ZooKeeper: Session: 0x1446360aa4a0001 closed
2014-02-24 10:11:32,949 INFO [master:hadoop-master:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-02-24 10:11:32,954 INFO [master:hadoop-master:60000] zookeeper.ZooKeeper: Session: 0x1446360aa4a0000 closed
2014-02-24 10:11:32,954 INFO [master:hadoop-master:60000] master.HMaster: HMaster main thread exiting
2014-02-24 10:11:32,955 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-02-24 10:11:32,955 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:192)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:134)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2787)
Have you upgraded from HBase 0.94.x to 0.96.x? The two versions are largely incompatible, starting with the migration to protocol buffers for the RPC mechanics and, yes, the changes in the meta-table approach (.META. is now hbase:meta, which is exactly what your FATAL log line complains about).
Please be sure you have checked the upgrade documentation:
http://hbase.apache.org/upgrading.html#upgrade0.96
Please pay special attention to the ZooKeeper service.
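As a sketch of the documented procedure: with HDFS and ZooKeeper running but HBase fully shut down, the 0.96 line ships an upgrade tool that can first verify and then perform the migration (see the guide linked above for the details and caveats):
bin/hbase upgrade -check
bin/hbase upgrade -execute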