HMaster gets aborted after starting ./start-hbase.sh in hbase-0.96.0 with hadoop 2.2.0.
I tried hbase-0.94.16 and hbase-0.98 with the same result: HMaster aborts as soon as it starts. I even tried replacing the jars in the hbase lib directory, both manually and via Maven, but the issue remains unresolved. Is there any other solution?
Below is the corresponding hbase-hadoop-master-hadoop-master.log...
2014-02-24 10:11:27,078 INFO [Replication.RpcServer.handler=2,port=60000] ipc.RpcServer: Replication.RpcServer.handler=2,port=60000: starting
2014-02-24 10:11:27,565 INFO [RpcServer.handler=23,port=60000] ipc.RpcServer: RpcServer.handler=23,port=60000: starting
2014-02-24 10:11:27,970 INFO [master:hadoop-master:60000] mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2014-02-24 10:11:28,172 INFO [master:hadoop-master:60000] http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2014-02-24 10:11:28,177 INFO [master:hadoop-master:60000] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context master
2014-02-24 10:11:28,177 INFO [master:hadoop-master:60000] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2014-02-24 10:11:28,191 INFO [master:hadoop-master:60000] http.HttpServer: Jetty bound to port 60010
2014-02-24 10:11:28,191 INFO [master:hadoop-master:60000] mortbay.log: jetty-6.1.26
2014-02-24 10:11:29,227 INFO [master:hadoop-master:60000] mortbay.log: Started SelectChannelConnector#0.0.0.0:60010
2014-02-24 10:11:29,623 INFO [master:hadoop-master:60000] master.ActiveMasterManager: Registered Active Master=hadoop-master.payoda.com,60000,1393236677609
2014-02-24 10:11:29,629 INFO [master:hadoop-master:60000] Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
2014-02-24 10:11:29,851 DEBUG [main-EventThread] master.ActiveMasterManager: A master is now available
2014-02-24 10:11:30,537 INFO [master:hadoop-master:60000] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2014-02-24 10:11:30,800 DEBUG [master:hadoop-master:60000] util.FSTableDescriptors: Current tableInfoPath = hdfs://hadoop-master:9000/hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001
2014-02-24 10:11:30,821 DEBUG [master:hadoop-master:60000] util.FSTableDescriptors: TableInfo already exists.. Skipping creation
2014-02-24 10:11:30,944 INFO [master:hadoop-master:60000] fs.HFileSystem: Added intercepting call to namenode#getBlockLocations so can do block reordering using class class org.apache.hadoop.hbase.fs.HFileSystem$ReorderWALBlocks
2014-02-24 10:11:30,950 INFO [master:hadoop-master:60000] master.SplitLogManager: Timeout=120000, unassigned timeout=180000, distributedLogReplay=false
2014-02-24 10:11:30,956 INFO [master:hadoop-master:60000] master.SplitLogManager: Found 0 orphan tasks and 0 rescan nodes
2014-02-24 10:11:31,000 INFO [master:hadoop-master:60000] zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.14.35:2181 sessionTimeout=90000 watcher=hconnection-0x4a867fad
2014-02-24 10:11:31,012 INFO [master:hadoop-master:60000-SendThread(hadoop-master.payoda.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server hadoop-master.payoda.com/192.168.14.35:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2014-02-24 10:11:31,617 INFO [master:hadoop-master:60000-SendThread(hadoop-master.payoda.com:2181)] zookeeper.ClientCnxn: Socket connection established to hadoop-master.payoda.com/192.168.14.35:2181, initiating session
2014-02-24 10:11:31,617 INFO [master:hadoop-master:60000] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x4a867fad connecting to ZooKeeper ensemble=192.168.14.35:2181
2014-02-24 10:11:31,620 INFO [master:hadoop-master:60000-SendThread(hadoop-master.payoda.com:2181)] zookeeper.ClientCnxn: Session establishment complete on server hadoop-master.payoda.com/192.168.14.35:2181, sessionid = 0x1446360aa4a0001, negotiated timeout = 90000
2014-02-24 10:11:31,640 DEBUG [master:hadoop-master:60000] catalog.CatalogTracker: Starting catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker#3eaa3e5b
**2014-02-24 10:11:31,684 FATAL [master:hadoop-master:60000] master.HMaster: Unhandled exception. Starting shutdown.
java.lang.IllegalArgumentException: .META. no longer exists. The table has been renamed to hbase:meta**
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:292)
at org.apache.hadoop.hbase.zookeeper.ZKTable.populateTableStates(ZKTable.java:82)
at org.apache.hadoop.hbase.zookeeper.ZKTable.<init>(ZKTable.java:69)
at org.apache.hadoop.hbase.master.AssignmentManager.<init>(AssignmentManager.java:281)
at org.apache.hadoop.hbase.master.HMaster.initializeZKBasedSystemTrackers(HMaster.java:677)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:809)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:603)
at java.lang.Thread.run(Thread.java:662)
2014-02-24 10:11:31,684 INFO [master:hadoop-master:60000] master.HMaster: Aborting
2014-02-24 10:11:31,711 DEBUG [master:hadoop-master:60000] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker#3eaa3e5b
2014-02-24 10:11:31,712 DEBUG [master:hadoop-master:60000] master.HMaster: Stopping service threads
2014-02-24 10:11:31,712 INFO [master:hadoop-master:60000] ipc.RpcServer: Stopping server on 60000
2014-02-24 10:11:31,712 INFO [RpcServer.handler=15,port=60000] ipc.RpcServer: RpcServer.handler=15,port=60000: exiting
2014-02-24 10:11:31,712 INFO [RpcServer.handler=23,port=60000] ipc.RpcServer: RpcServer.handler=23,port=60000: exiting
2014-02-24 10:11:32,129 INFO [master:hadoop-master:60000] master.HMaster: Stopping infoServer
2014-02-24 10:11:32,138 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2014-02-24 10:11:32,138 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2014-02-24 10:11:32,304 INFO [hadoop-master.payoda.com,60000,1393236677609.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: hadoop-master.payoda.com,60000,1393236677609.splitLogManagerTimeoutMonitor exiting
2014-02-24 10:11:32,304 INFO [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2014-02-24 10:11:32,304 INFO [Replication.RpcServer.handler=2,port=60000] ipc.RpcServer: Replication.RpcServer.handler=2,port=60000: exiting
2014-02-24 10:11:32,304 INFO [Replication.RpcServer.handler=1,port=60000] ipc.RpcServer: Replication.RpcServer.handler=1,port=60000: exiting
2014-02-24 10:11:32,304 INFO [Replication.RpcServer.handler=0,port=60000] ipc.RpcServer: Replication.RpcServer.handler=0,port=60000: exiting
2014-02-24 10:11:32,304 INFO [RpcServer.handler=29,port=60000] ipc.RpcServer: RpcServer.handler=29,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=28,port=60000] ipc.RpcServer: RpcServer.handler=28,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=27,port=60000] ipc.RpcServer: RpcServer.handler=27,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=26,port=60000] ipc.RpcServer: RpcServer.handler=26,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=25,port=60000] ipc.RpcServer: RpcServer.handler=25,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=24,port=60000] ipc.RpcServer: RpcServer.handler=24,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=22,port=60000] ipc.RpcServer: RpcServer.handler=22,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=21,port=60000] ipc.RpcServer: RpcServer.handler=21,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=20,port=60000] ipc.RpcServer: RpcServer.handler=20,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=19,port=60000] ipc.RpcServer: RpcServer.handler=19,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=18,port=60000] ipc.RpcServer: RpcServer.handler=18,port=60000: exiting
2014-02-24 10:11:32,305 INFO [RpcServer.handler=17,port=60000] ipc.RpcServer: RpcServer.handler=17,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=16,port=60000] ipc.RpcServer: RpcServer.handler=16,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=14,port=60000] ipc.RpcServer: RpcServer.handler=14,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=13,port=60000] ipc.RpcServer: RpcServer.handler=13,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=12,port=60000] ipc.RpcServer: RpcServer.handler=12,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=11,port=60000] ipc.RpcServer: RpcServer.handler=11,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=10,port=60000] ipc.RpcServer: RpcServer.handler=10,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=9,port=60000] ipc.RpcServer: RpcServer.handler=9,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=8,port=60000] ipc.RpcServer: RpcServer.handler=8,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=7,port=60000] ipc.RpcServer: RpcServer.handler=7,port=60000: exiting
2014-02-24 10:11:32,306 INFO [RpcServer.handler=6,port=60000] ipc.RpcServer: RpcServer.handler=6,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=5,port=60000] ipc.RpcServer: RpcServer.handler=5,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=4,port=60000] ipc.RpcServer: RpcServer.handler=4,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=3,port=60000] ipc.RpcServer: RpcServer.handler=3,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=2,port=60000] ipc.RpcServer: RpcServer.handler=2,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=1,port=60000] ipc.RpcServer: RpcServer.handler=1,port=60000: exiting
2014-02-24 10:11:32,307 INFO [RpcServer.handler=0,port=60000] ipc.RpcServer: RpcServer.handler=0,port=60000: exiting
2014-02-24 10:11:32,930 INFO [master:hadoop-master:60000] mortbay.log: Stopped SelectChannelConnector#0.0.0.0:60010
2014-02-24 10:11:32,945 INFO [master:hadoop-master:60000] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1446360aa4a0001
2014-02-24 10:11:32,948 INFO [master:hadoop-master:60000] zookeeper.ZooKeeper: Session: 0x1446360aa4a0001 closed
2014-02-24 10:11:32,949 INFO [master:hadoop-master:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-02-24 10:11:32,954 INFO [master:hadoop-master:60000] zookeeper.ZooKeeper: Session: 0x1446360aa4a0000 closed
2014-02-24 10:11:32,954 INFO [master:hadoop-master:60000] master.HMaster: HMaster main thread exiting
2014-02-24 10:11:32,955 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-02-24 10:11:32,955 ERROR [main] master.HMasterCommandLine: Master exiting
**java.lang.RuntimeException: HMaster Aborted**
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:192)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:134)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2787)
Have you upgraded from HBase 0.94.x to 0.96.x? The two are largely incompatible, starting with the migration to protocol buffers for the RPC layer, and, yes, the changed approach to the meta tables (which is exactly what the ".META. no longer exists" error above points to).
Please be sure you have followed the upgrade documentation:
http://hbase.apache.org/upgrading.html#upgrade0.96
Please pay special attention to the ZooKeeper service.
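If you are indeed coming from 0.94.x, the documented upgrade flow is roughly the sketch below (this is only an outline assuming an in-place upgrade; verify the exact steps for your versions against the guide above):
./bin/stop-hbase.sh          # shut the old cluster down cleanly first
./bin/hbase upgrade -check   # run from the 0.96 install: reports blockers such as HFile v1 files
./bin/hbase upgrade -execute # one-time migration of filesystem and ZooKeeper state
./bin/start-hbase.sh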
Related
I'm trying to export data from Teradata to Hadoop, but my export query is failing with the error "Failed to write data". Please look at the MapReduce and application logs below:
Log Type: syslog
Log Upload Time: Tue Mar 08 22:59:27 -0800 2016
Log Length: 4931
2016-03-08 22:47:07,414 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2016-03-08 22:47:07,499 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-03-08 22:47:07,499 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2016-03-08 22:47:07,509 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2016-03-08 22:47:07,510 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1457504560070_0004, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier#175b9425)
2016-03-08 22:47:07,556 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: RM_DELEGATION_TOKEN, Service: 39.7.48.2:8032,39.7.48.3:8032, Ident: (owner=hive, renewer=oozie mr token, realUser=oozie, issueDate=1457506410968, maxDate=1458111210968, sequenceNumber=908, masterKeyId=280)
2016-03-08 22:47:07,599 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2016-03-08 22:47:07,848 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /data1/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data2/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data3/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data4/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data5/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data6/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data7/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data8/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data9/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data10/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004,/data12/hadoop/yarn/local/usercache/hive/appcache/application_1457504560070_0004
2016-03-08 22:47:08,132 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2016-03-08 22:47:08,623 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2016-03-08 22:47:08,840 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: com.teradata.dynaload.hcatalog.mapper.TDInputFormat$TeradataInputSplit#2ece4966
2016-03-08 22:47:08,844 INFO [main] com.teradata.dynaload.hcatalog.mapper.TDInputFormat$TeradataRecordReader: recordreader class com.teradata.dynaload.hcatalog.mapper.TDInputFormat$TeradataRecordReaderinitialize time is: 1457506028844
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 300417020(1201668080)
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 1146
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 841167680
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 1201668096
2016-03-08 22:47:09,512 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 300417020; length = 75104256
2016-03-08 22:47:09,515 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2016-03-08 22:47:09,518 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
2016-03-08 22:47:09,518 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2016-03-08 22:47:09,848 WARN [main] org.apache.hadoop.hive.conf.HiveConf: HiveConf of name hive.metastore.local does not exist
2016-03-08 22:47:09,914 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://apus2.labs.teradata.com:9083
2016-03-08 22:47:09,951 INFO [main] hive.metastore: Connected to metastore.
2016-03-08 22:47:10,407 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2016-03-08 22:47:10,452 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
2016-03-08 22:47:10,453 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.work.output.dir is deprecated. Instead, use mapreduce.task.output.dir
2016-03-08 22:47:10,453 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
2016-03-08 22:47:10,457 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
Application Master logs:
Log Type: stderr
Log Upload Time: Tue Mar 08 22:59:27 -0800 2016
Log Length: 240
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Log Type: stdout
Log Upload Time: Tue Mar 08 22:59:27 -0800 2016
Log Length: 0
Log Type: syslog
Log Upload Time: Tue Mar 08 22:59:27 -0800 2016
Log Length: 66959
Showing 4096 bytes of 66959 total.
ILED
2016-03-08 22:59:19,325 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing event JOB_FAILED
2016-03-08 22:59:19,456 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://C423A:8020/user/hive/.staging/job_1457504560070_0004/job_1457504560070_0004_1.jhist to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004-1457506422934-hive-oozie%3Aaction%3AT%3Djava%3AW%3DTDExportMR%3AA%3Dexport%3AID%3D00001-1457506759193-0-0-FAILED-default-1457506429243.jhist_tmp
2016-03-08 22:59:19,550 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004-1457506422934-hive-oozie%3Aaction%3AT%3Djava%3AW%3DTDExportMR%3AA%3Dexport%3AID%3D00001-1457506759193-0-0-FAILED-default-1457506429243.jhist_tmp
2016-03-08 22:59:19,562 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://C423A:8020/user/hive/.staging/job_1457504560070_0004/job_1457504560070_0004_1_conf.xml to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004_conf.xml_tmp
2016-03-08 22:59:19,614 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004_conf.xml_tmp
2016-03-08 22:59:19,645 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004.summary_tmp to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004.summary
2016-03-08 22:59:19,654 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004_conf.xml_tmp to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004_conf.xml
2016-03-08 22:59:19,666 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004-1457506422934-hive-oozie%3Aaction%3AT%3Djava%3AW%3DTDExportMR%3AA%3Dexport%3AID%3D00001-1457506759193-0-0-FAILED-default-1457506429243.jhist_tmp to hdfs://C423A:8020/mr-history/tmp/hive/job_1457504560070_0004-1457506422934-hive-oozie%3Aaction%3AT%3Djava%3AW%3DTDExportMR%3AA%3Dexport%3AID%3D00001-1457506759193-0-0-FAILED-default-1457506429243.jhist
2016-03-08 22:59:19,666 INFO [Thread-89] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2016-03-08 22:59:19,671 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics to Task failed task_1457504560070_0004_m_000004
Job failed as tasks failed. failedMaps:1 failedReduces:0
2016-03-08 22:59:19,672 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: History url is http://apus2.labs.teradata.com:19888/jobhistory/job/job_1457504560070_0004
2016-03-08 22:59:19,680 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Waiting for application to be successfully unregistered.
2016-03-08 22:59:20,682 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Final Stats: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:7 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:7 ContRel:0 HostLocal:6 RackLocal:1
2016-03-08 22:59:20,684 INFO [Thread-89] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://C423A /user/hive/.staging/job_1457504560070_0004
2016-03-08 22:59:20,711 INFO [Thread-89] org.apache.hadoop.ipc.Server: Stopping server on 46067
2016-03-08 22:59:20,712 INFO [IPC Server listener on 46067] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 46067
2016-03-08 22:59:20,712 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-03-08 22:59:20,714 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted.
Please help me resolve this issue.
You must be using Sqoop to bring data into Hadoop. Please paste the command you are running. For "Failed to write data" there can be multiple causes: the destination parent directory is not available, the cluster is out of space, etc. Only the command can give an explanation.
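For reference, a generic Sqoop export looks like the sketch below; every name here (host, database, user, table, directory) is a placeholder rather than something from your setup, and a Teradata target additionally needs the vendor's connector/JDBC jars on Sqoop's classpath:
sqoop export \
  --connect jdbc:teradata://td-host/DATABASE=mydb \
  --username myuser -P \
  --table MY_TABLE \
  --export-dir /user/hive/exportdata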
I can run the SparkPi example on the master node, but when I try the same command
"spark-submit --class SparkPi --master yarn-client sparkpi.jar 10"
on the slave node, I get an error:
2015-05-19 14:05:44,881 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing view acls to: maintainer
2015-05-19 14:05:44,886 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - Changing modify acls to: maintainer
2015-05-19 14:05:44,887 INFO [main] spark.SecurityManager (Logging.scala:logInfo(59)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(maintainer); users with modify permissions: Set(maintainer)
2015-05-19 14:05:45,389 INFO [sparkDriver-akka.actor.default-dispatcher-4] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2015-05-19 14:05:45,443 INFO [sparkDriver-akka.actor.default-dispatcher-4] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2015-05-19 14:05:45,641 INFO [sparkDriver-akka.actor.default-dispatcher-3] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriver#slave2.com:33055]
2015-05-19 14:05:45,644 INFO [sparkDriver-akka.actor.default-dispatcher-3] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting now listens on addresses: [akka.tcp://sparkDriver#slave2.com:33055]
2015-05-19 14:05:45,653 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 33055.
2015-05-19 14:05:45,674 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering MapOutputTracker
2015-05-19 14:05:45,688 INFO [main] spark.SparkEnv (Logging.scala:logInfo(59)) - Registering BlockManagerMaster
2015-05-19 14:05:45,707 INFO [main] storage.DiskBlockManager (Logging.scala:logInfo(59)) - Created local directory at /tmp/spark-local-20150519140545-c81b
2015-05-19 14:05:45,712 INFO [main] storage.MemoryStore (Logging.scala:logInfo(59)) - MemoryStore started with capacity 265.4 MB
2015-05-19 14:05:46,205 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-05-19 14:05:46,408 INFO [main] spark.HttpFileServer (Logging.scala:logInfo(59)) - HTTP File server directory is /tmp/spark-e95a2b5b-efea-41eb-93b9-0a9f7d6f6701
2015-05-19 14:05:46,413 INFO [main] spark.HttpServer (Logging.scala:logInfo(59)) - Starting HTTP Server
2015-05-19 14:05:46,477 INFO [main] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-05-19 14:05:46,499 INFO [main] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SocketConnector#0.0.0.0:52737
2015-05-19 14:05:46,500 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'HTTP file server' on port 52737.
2015-05-19 14:05:46,790 INFO [main] server.Server (Server.java:doStart(272)) - jetty-8.y.z-SNAPSHOT
2015-05-19 14:05:46,805 INFO [main] server.AbstractConnector (AbstractConnector.java:doStart(338)) - Started SelectChannelConnector#0.0.0.0:4040
2015-05-19 14:05:46,805 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'SparkUI' on port 4040.
2015-05-19 14:05:46,808 INFO [main] ui.SparkUI (Logging.scala:logInfo(59)) - Started SparkUI at http://slave2.com:4040
2015-05-19 14:05:47,058 INFO [main] spark.SparkContext (Logging.scala:logInfo(59)) - Added JAR file:/home/maintainer/myjars/sparkpi.jar at http://[ip]:52737/jars/sparkpi.jar with timestamp 1432033547057
2015-05-19 14:05:47,190 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
2015-05-19 14:09:45,861 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
**2015-05-19 14:09:47,067 INFO [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-05-19 14:09:48,068 INFO [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
...**
Aside from specifying the yarn.resourcemanager.hostname property in yarn-site.xml, it's also necessary to propagate the configuration files to the workers.
This can be done with the following line (before running spark-submit):
export SPARK_YARN_DIST_FILES=$(ls $HADOOP_CONF_DIR* | sed 's#^#file://#g' | tr '\n' ',' | sed 's/,$//')
If everything is configured correctly, you'll see the RM hostname instead of 0.0.0.0 in this line:
2015-05-19 14:05:47,190 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
Exporting the correct value for HADOOP_CONF_DIR fixed the issue:
export HADOOP_CONF_DIR=/your-path/hadoop/conf
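Putting the two answers together, a minimal sketch for the slave node (assuming the Hadoop client configs live under /etc/hadoop/conf; adjust the path for your cluster):
export HADOOP_CONF_DIR=/etc/hadoop/conf
export SPARK_YARN_DIST_FILES=$(ls $HADOOP_CONF_DIR/* | sed 's#^#file://#g' | tr '\n' ',' | sed 's/,$//')
spark-submit --class SparkPi --master yarn-client sparkpi.jar 10
Note that writing $HADOOP_CONF_DIR/* (with the slash) makes the glob expand to full paths, so the file:// prefixes end up pointing at real files.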
Previously I ran my Spark job from the command line with the spark-submit command:
sudo -su root spark-submit --executor-memory 512m --num-executors 50 --class com.mycompany.project.SparkHdfsToHBase --master yarn parser-1.0-jar-with-dependencies.jar 0 hdfs://company.com:8020/tmp/testfiles/html_files/* 127.0.0.1 --verbose
And everything works like a charm.
Now I am trying to run the Spark job through an Oozie workflow. I have created the workflow and added a shell action with a Python script.
The Python script looks like this:
#!/usr/bin/env python
import subprocess

# Run the same spark-submit invocation that worked from the shell.
subprocess.call(["spark-submit", "--class", "com.mycompany.project.SparkHdfsToHBase",
                 "--executor-memory", "512m", "--driver-memory", "512m", "--master", "yarn",
                 "file:///parser-1.0-jar-with-dependencies.jar", "0",
                 "hdfs://company.com:8020/tmp/testfiles/html_files/*", "127.0.0.1", "--verbose"])
I haven't added any other parameters for the job, only the path to the file. When I submit the job, it gets stuck at 5% in the Job Browser.
I haven't gotten any useful information from the logs:
Syslog:
2015-03-16 11:46:51,152 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-03-16 11:46:51,152 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2015-03-16 11:46:51,154 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2015-03-16 11:46:51,157 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2015-03-16 11:46:51,172 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-03-16 11:46:51,354 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2015-03-16 11:46:51,354 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier#46ea3050)
2015-03-16 11:46:51,393 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: RM_DELEGATION_TOKEN, Service: 172.24.4.231:8032, Ident: (owner=admin, renewer=oozie mr token, realUser=oozie, issueDate=1426499201955, maxDate=1427104001955, sequenceNumber=6, masterKeyId=2)
2015-03-16 11:46:51,631 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert; Ignoring.
2015-03-16 11:46:51,633 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-03-16 11:46:51,637 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2015-03-16 11:46:51,640 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2015-03-16 11:46:51,644 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2015-03-16 11:46:51,661 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-03-16 11:46:52,495 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-03-16 11:46:52,803 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2015-03-16 11:46:52,805 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
2015-03-16 11:46:52,868 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2015-03-16 11:46:52,869 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2015-03-16 11:46:52,870 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2015-03-16 11:46:52,871 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2015-03-16 11:46:52,872 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2015-03-16 11:46:52,873 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2015-03-16 11:46:52,874 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2015-03-16 11:46:52,875 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2015-03-16 11:46:52,967 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2015-03-16 11:46:53,241 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-03-16 11:46:53,303 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-03-16 11:46:53,303 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2015-03-16 11:46:53,320 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1426498753001_0001 to jobTokenSecretManager
2015-03-16 11:46:53,447 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1426498753001_0001 because: not enabled;
2015-03-16 11:46:53,473 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1426498753001_0001 = 0. Number of splits = 1
2015-03-16 11:46:53,473 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1426498753001_0001 = 0
2015-03-16 11:46:53,473 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1426498753001_0001Job Transitioned from NEW to INITED
2015-03-16 11:46:53,474 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1426498753001_0001.
2015-03-16 11:46:53,532 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-03-16 11:46:53,542 INFO [Socket Reader #1 for port 44690] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 44690
2015-03-16 11:46:53,564 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2015-03-16 11:46:53,565 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-03-16 11:46:53,566 INFO [IPC Server listener on 44690] org.apache.hadoop.ipc.Server: IPC Server listener on 44690: starting
2015-03-16 11:46:53,566 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at company.com/172.24.4.231:44690
2015-03-16 11:46:53,641 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-03-16 11:46:53,646 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
2015-03-16 11:46:53,658 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-03-16 11:46:53,663 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
2015-03-16 11:46:53,663 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
2015-03-16 11:46:53,667 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
2015-03-16 11:46:53,667 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2015-03-16 11:46:53,678 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 57328
2015-03-16 11:46:53,678 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
2015-03-16 11:46:53,706 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/jars/hadoop-yarn-common-2.5.0-cdh5.3.1.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_57328_mapreduce____37uqf6/webapp
2015-03-16 11:46:54,346 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:57328
2015-03-16 11:46:54,346 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 57328
2015-03-16 11:46:55,005 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2015-03-16 11:46:55,012 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-03-16 11:46:55,012 INFO [Socket Reader #1 for port 60762] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 60762
2015-03-16 11:46:55,021 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-03-16 11:46:55,023 INFO [IPC Server listener on 60762] org.apache.hadoop.ipc.Server: IPC Server listener on 60762: starting
2015-03-16 11:46:55,111 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2015-03-16 11:46:55,111 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2015-03-16 11:46:55,111 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2015-03-16 11:46:55,369 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert; Ignoring.
2015-03-16 11:46:55,369 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-03-16 11:46:55,369 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2015-03-16 11:46:55,370 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2015-03-16 11:46:55,375 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2015-03-16 11:46:55,383 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-03-16 11:46:55,388 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at company.com/172.24.4.231:8030
2015-03-16 11:46:55,574 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: <memory:8192, vCores:2>
2015-03-16 11:46:55,574 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.admin
2015-03-16 11:46:55,582 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2015-03-16 11:46:55,584 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
2015-03-16 11:46:55,599 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1426498753001_0001Job Transitioned from INITED to SETUP
2015-03-16 11:46:55,611 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2015-03-16 11:46:55,633 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1426498753001_0001Job Transitioned from SETUP to RUNNING
2015-03-16 11:46:55,714 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1426498753001_0001_m_000000 Task Transitioned from NEW to SCHEDULED
2015-03-16 11:46:55,723 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1426498753001_0001_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2015-03-16 11:46:55,738 INFO [Thread-50] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1>
2015-03-16 11:46:55,797 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1426498753001_0001, File: hdfs://company.com:8020/user/admin/.staging/job_1426498753001_0001/job_1426498753001_0001_1.jhist
2015-03-16 11:46:56,584 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2015-03-16 11:46:56,658 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1426498753001_0001: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:7168, vCores:0> knownNMs=1
Stderr:
Mar 16, 2015 11:47:04 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Mar 16, 2015 11:47:04 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Mar 16, 2015 11:47:04 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Mar 16, 2015 11:47:04 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Mar 16, 2015 11:47:04 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Mar 16, 2015 11:47:05 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Mar 16, 2015 11:47:05 AM com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
WARNING: You are attempting to use a deprecated API (specifically, attempting to #Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
Mar 16, 2015 11:47:05 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
I'll be happy to get any help. Thanks.
I'm getting the below error while submitting a spark-submit job. Can anyone please suggest how to resolve this issue?
15/02/18 12:06:17 INFO network.ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl#5173169
java.nio.channels.CancelledKeyException
at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
15/02/18 12:06:17 ERROR network.ConnectionManager: Corresponding SendingConnection to ConnectionManagerId(bkcttplpd037.verizon.com,39010) not found
15/02/18 12:06:17 INFO network.ConnectionManager: Key not valid ? sun.nio.ch.SelectionKeyImpl#7a73a542
15/02/18 12:06:17 INFO network.ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl#7a73a542
java.nio.channels.CancelledKeyException
at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:310)
at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
15/02/18 12:06:18 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
15/02/18 12:06:18 INFO network.ConnectionManager: Selector thread was interrupted!
15/02/18 12:06:18 INFO network.ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(abc02.com,49740)
15/02/18 12:06:18 ERROR network.ConnectionManager: Corresponding SendingConnection to ConnectionManagerId(abc01.com,49740) not found
15/02/18 12:06:18 WARN network.ConnectionManager: All connections not cleaned up
15/02/18 12:06:18 INFO network.ConnectionManager: ConnectionManager stopped
15/02/18 12:06:18 INFO storage.MemoryStore: MemoryStore cleared
15/02/18 12:06:18 INFO storage.BlockManager: BlockManager stopped
15/02/18 12:06:18 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/02/18 12:06:18 INFO spark.SparkContext: Successfully stopped SparkContext
15/02/18 12:06:18 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/02/18 12:06:18 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
I'm using Hadoop 2.4.0 / HBase 0.98.0 / Hive 0.14.0.
Hadoop and HBase were running fine until I restarted my HMaster. The following error appears in the hbase-hduser-master-master.log file:
2015-02-17 05:46:15,157 INFO [master:master:60000] master.TableNamespaceManager: Namespace table not found. Creating...
2015-02-17 05:46:15,193 DEBUG [master:master:60000] lock.ZKInterProcessLockBase: Acquired a lock for /hbase/table-lock/hbase:namespace/write-master:600000000000004
2015-02-17 05:46:15,212 DEBUG [master:master:60000] lock.ZKInterProcessLockBase: Released /hbase/table-lock/hbase:namespace/write-master:600000000000004
2015-02-17 05:46:15,212 FATAL [master:master:60000] master.HMaster: Master server abort: loaded coprocessors are: []
2015-02-17 05:46:15,213 FATAL [master:master:60000] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:120)
at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:232)
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1049)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:913)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:606)
at java.lang.Thread.run(Unknown Source)
2015-02-17 05:46:15,214 INFO [master:master:60000] master.HMaster: Aborting
2015-02-17 05:46:15,214 INFO [master,60000,1424180766819-BalancerChore] balancer.BalancerChore: master,60000,1424180766819-BalancerChore exiting
2015-02-17 05:46:15,215 INFO [master,60000,1424180766819-ClusterStatusChore] balancer.ClusterStatusChore: master,60000,1424180766819-ClusterStatusChore exiting
2015-02-17 05:46:15,215 INFO [CatalogJanitor-master:60000] master.CatalogJanitor: CatalogJanitor-master:60000 exiting
2015-02-17 05:46:15,216 DEBUG [master:master:60000] master.HMaster: Stopping service threads
2015-02-17 05:46:15,216 INFO [master:master:60000] ipc.RpcServer: Stopping server on 60000
2015-02-17 05:46:15,216 INFO [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2015-02-17 05:46:15,218 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2015-02-17 05:46:15,218 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2015-02-17 05:46:15,218 INFO [master:master:60000.oldLogCleaner] cleaner.LogCleaner: master:master:60000.oldLogCleaner exiting
2015-02-17 05:46:15,218 INFO [master:master:60000.oldLogCleaner] master.ReplicationLogCleaner: Stopping replicationLogCleaner-0x14b97c83f580008, quorum=slave:2181,master:2181, baseZNode=/hbase
2015-02-17 05:46:15,219 INFO [master:master:60000.archivedHFileCleaner] cleaner.HFileCleaner: master:master:60000.archivedHFileCleaner exiting
2015-02-17 05:46:15,219 INFO [master:master:60000] master.HMaster: Stopping infoServer
2015-02-17 05:46:15,223 INFO [master:master:60000.oldLogCleaner] zookeeper.ZooKeeper: Session: 0x14b97c83f580008 closed
2015-02-17 05:46:15,223 INFO [master:master:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-02-17 05:46:15,229 INFO [master:master:60000] mortbay.log: Stopped SelectChannelConnector#0.0.0.0:60010
2015-02-17 05:46:15,236 DEBUG [master:master:60000] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker#19f9598
2015-02-17 05:46:15,236 INFO [master:master:60000] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x14b97c83f580007
2015-02-17 05:46:15,237 INFO [master:master:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-02-17 05:46:15,238 INFO [master:master:60000] zookeeper.ZooKeeper: Session: 0x14b97c83f580007 closed
2015-02-17 05:46:15,238 INFO [master,60000,1424180766819.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: master,60000,1424180766819.splitLogManagerTimeoutMonitor exiting
2015-02-17 05:46:15,243 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-02-17 05:46:15,243 INFO [master:master:60000] zookeeper.ZooKeeper: Session: 0x14b97c83f580006 closed
2015-02-17 05:46:15,243 INFO [master:master:60000] master.HMaster: HMaster main thread exiting
2015-02-17 05:46:15,243 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:192)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:134)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2785)
What's wrong here, and what does "HMaster Aborted" mean?
For more information, this is what my hbase-site.xml looks like:
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:54310/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/local/hbase/zookeeper</value>
</property>
I ran into this problem today! My solution is as follows:
Step 1: Stop HBase.
Step 2: Run the following command:
hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
This command repairs HBase's metadata.
Step 3: Delete the data in ZooKeeper (WARNING: this will make you lose your old data; a non-interactive variant is sketched after these steps):
./opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/zookeeper/bin/zkCli.sh
You can use ls / to inspect the data in ZooKeeper, and rmr /hbase to delete HBase's data from it.
Step 4: Start HBase.
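As a side note, assuming a ZooKeeper 3.4-era zkCli.sh (which accepts a single command after its options), the delete in step 3 can also be done non-interactively; the host and port below are examples, so point it at your own quorum:
./zkCli.sh -server master:2181 rmr /hbase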
This is based on the other answer, but clarified for an upgrade on Cloudera 5.4.
Step 1: Stop HBase:
service hbase-regionserver stop
service hbase-master stop
Step 2: Run the repair tool:
hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
Step 3: Delete the data in ZooKeeper (WARNING: this will make you lose your old data):
cd /usr/lib/zookeeper/bin/
./zkCli.sh
This opens the ZooKeeper shell.
Then run:
ls /
rmr /hbase
Step 4: Start HBase:
service hbase-master restart
service hbase-regionserver restart
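Once both services are back up, it's worth sanity-checking that the master stayed up this time, for example via the shell (hbase shell reads commands from stdin):
echo "status" | hbase shell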