ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml

at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2924)
Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag </name>; expected </nafme>.
at [row,col,system-id]: [39,40,"file:/opt/module/hadoop-3.1.3/etc/hadoop/mapred-site.xml"]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:475)
at com.ctc.wstx.sr.BasicStreamReader.reportWrongEndElem(BasicStreamReader.java:3365)
at com.ctc.wstx.sr.BasicStreamReader.readEndElem(BasicStreamReader.java:3292)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2911)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3320)
at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3114)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3007)
... 14 more
2022-12-02 13:21:18,536 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag </name>; expected </nafme>.
at [row,col,system-id]: [39,40,"file:/opt/module/hadoop-3.1.3/etc/hadoop/mapred-site.xml"]
2022-12-02 13:21:18,551 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hadoop162/192.168.10.162
************************************************************/
2022-12-02 13:21:18,597 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
com.ctc.wstx.exc.WstxParsingException: Unexpected close tag </name>; expected </nafme>.
at [row,col,system-id]: [39,40,"file:/opt/module/hadoop-3.1.3/etc/hadoop/mapred-site.xml"]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:475)
at com.ctc.wstx.sr.BasicStreamReader.reportWrongEndElem(BasicStreamReader.java:3365)
at com.ctc.wstx.sr.BasicStreamReader.readEndElem(BasicStreamReader.java:3292)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2911)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3320)
at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3114)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3007)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200)
at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812)
at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789)
at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
/opt/module/hadoop-3.1.3/logs » xcall atguigu@hadoop162
========== hadoop162 =========
3642 Jps
========== hadoop163 =========
3047 NodeManager
2603 DataNode
2893 ResourceManager
3503 Jps
========== hadoop164 =========
1191 DataNode
1368 NodeManager
1597 Jps
/opt/module/hadoop-3.1.3/logs » atguigu@hadoop162
There is an exception and I can't start Hadoop:
2022-12-02 13:21:18,597 ERROR org.apache.hadoop.conf.Configuration: error parsing conf mapred-site.xml
com.ctc.wstx.exc.WstxParsingException: Unexpected close tag </name>; expected </nafme>.
at [row,col,system-id]: [39,40,"file:/opt/module/hadoop-3.1.3/etc/hadoop/mapred-site.xml"]

As the error says, you have mismatched XML tags in mapred-site.xml: the parser found an opening <nafme> tag (a typo for <name>) and then hit a closing </name> tag at line 39, column 40.
nafme isn't a valid property tag; correct the opening tag to <name> so the pair matches.
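A minimal way to confirm and apply the fix from the shell, assuming the misspelled opening tag really is <nafme> as the parser reports (the path comes from the log above; xmllint may not be installed by default):
# show any XML errors with their line and column
xmllint --noout /opt/module/hadoop-3.1.3/etc/hadoop/mapred-site.xml
# replace the misspelled opening tag, then re-check that the file parses cleanly
sed -i 's|<nafme>|<name>|' /opt/module/hadoop-3.1.3/etc/hadoop/mapred-site.xml
xmllint --noout /opt/module/hadoop-3.1.3/etc/hadoop/mapred-site.xml
Every <property> entry must pair <name>...</name> and <value>...</value> exactly; once the file parses, restart the daemons on hadoop162 (and if you distribute your configs to the other nodes, copy the fixed file there too).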

Related

Executing a Hive query causes the YARN ResourceManager to throw a "file does not exist" exception

I'm configuring Hive 3.1.0 to work with Hadoop 3.0.0.
This error is thrown almost immediately when I submit a simple query in Beeline that triggers a MapReduce job:
0: jdbc:hive2://> select count(*) from airlinedata;
18/10/11 10:24:45 [HiveServer2-Background-Pool: Thread-124]: WARN ql.Driver: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = UUT81HC_20181011102444_2df01ff5-ca05-403c-b0e1-15f8f7715dc7
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
2018-10-11 10:24:45,510 INFO [HiveServer2-Background-Pool: Thread-124] client.RMProxy (RMProxy.java:newProxyInstance(133)) - Connecting to ResourceManager at /10.184.153.232:8032
2018-10-11 10:24:45,555 INFO [HiveServer2-Background-Pool: Thread-124] client.RMProxy (RMProxy.java:newProxyInstance(133)) - Connecting to ResourceManager at /10.184.153.232:8032
18/10/11 10:24:45 [HiveServer2-Background-Pool: Thread-124]: WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
WARN : Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73)
at org.apache.hadoop.mapreduce.TypeConverter.toYarn(TypeConverter.java:78)
at org.apache.hadoop.mapred.ClientServiceDelegate.<init>(ClientServiceDelegate.java:120)
at org.apache.hadoop.mapred.ClientCache.getClient(ClientCache.java:68)
at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:343)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:329)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70)
... 40 more
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
Location:
org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual
Reason:
Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage'
Current Frame:
bci: #36
flags: { }
locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
Bytecode:
0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7
0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a
0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5
0x0000030: 0013 2ab0
Stackmap Table:
same_frame(#19)
same_frame(#31)
same_frame(#40)
at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017)
at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37)
... 45 more
YARN ResourceManager stack trace:
2018-10-11 10:24:49,896 INFO rmapp.RMAppImpl: application_1539226955170_0002 State change from ACCEPTED to FINAL_SAVING on event = ATTEMPT_FAILED
2018-10-11 10:24:49,896 INFO recovery.RMStateStore: Updating info for app: application_1539226955170_0002
2018-10-11 10:24:49,897 INFO capacity.CapacityScheduler: Application Attempt appattempt_1539226955170_0002_000002 is done. finalState=FAILED
2018-10-11 10:24:49,897 INFO rmapp.RMAppImpl: Application application_1539226955170_0002 failed 2 times due to AM Container for appattempt_1539226955170_0002_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2018-10-11 10:24:49.876]File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
java.io.FileNotFoundException: File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1495)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1488)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1503)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:366)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:241)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:234)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:222)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002 Then click on links to logs of each attempt.
. Failing the application.
2018-10-11 10:24:49,897 INFO scheduler.AppSchedulingInfo: Application application_1539226955170_0002 requests cleared
2018-10-11 10:24:49,897 INFO rmapp.RMAppImpl: application_1539226955170_0002 State change from FINAL_SAVING to FAILED on event = APP_UPDATE_SAVED
2018-10-11 10:24:49,898 INFO capacity.LeafQueue: Application removed - appId: application_1539226955170_0002 user: UUT81HC queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2018-10-11 10:24:49,898 WARN resourcemanager.RMAuditLogger: USER=UUT81HC OPERATION=Application Finished - Failed
TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1539226955170_0002 failed 2 times due to AM Container for appattempt_1539226955170_0002_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2018-10-11 10:24:49.876]File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
java.io.FileNotFoundException: File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1495)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1488)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1503)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:366)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:241)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:234)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:222)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002 Then click on links to logs of each attempt.
. Failing the application. APPID=application_1539226955170_0002
2018-10-11 10:24:49,898 INFO capacity.ParentQueue: Application removed - appId: application_1539226955170_0002 user: UUT81HC leaf-queue of parent: root #applications: 0
2018-10-11 10:24:49,899 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1539226955170_0002,name=select count(*) from airlinedata (Stage-1),user=UUT81HC,queue=default,state=FAILED,trackingUrl=http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002,appMasterHost=N/A,submitTime=1539228287412,startTime=1539228287413,finishTime=1539228289896,finalStatus=FAILED,memorySeconds=1482,vcoreSeconds=0,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=,applicationType=MAPREDUCE,resourceSeconds=1482 MB-seconds\, 0 vcore-seconds,preemptedResourceSeconds=0 MB-seconds\, 0 vcore-seconds
After examining how Hive executes a MapReduce job on YARN, I found that it first creates map.xml and reduce.xml in /tmp with permission drwx------ (only the owner can use it):
2018-10-11 10:24:45,133 INFO hdfs.StateChange: BLOCK* allocate blk_1073742318_1495, replicas=10.184.153.232:9866 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml
2018-10-11 10:24:45,225 INFO hdfs.StateChange: DIR* completeFile: /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:45,248 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml
2018-10-11 10:24:45,294 INFO hdfs.StateChange: BLOCK* allocate blk_1073742319_1496, replicas=10.184.153.232:9866 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
2018-10-11 10:24:45,411 INFO hdfs.StateChange: DIR* completeFile: /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:45,437 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
2018-10-11 10:24:45,772 INFO hdfs.StateChange: BLOCK* allocate blk_1073742320_1497, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar
2018-10-11 10:24:46,438 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,463 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar
2018-10-11 10:24:46,618 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split
2018-10-11 10:24:46,639 INFO hdfs.StateChange: BLOCK* allocate blk_1073742321_1498, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split
2018-10-11 10:24:46,706 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,791 INFO hdfs.StateChange: BLOCK* allocate blk_1073742322_1499, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.splitmetainfo
2018-10-11 10:24:46,870 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.splitmetainfo is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,971 INFO hdfs.StateChange: BLOCK* allocate blk_1073742323_1500, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.xml
2018-10-11 10:24:47,370 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:32:15,741 INFO blockmanagement.BlockManager: StorageInfo TreeSet fill ratio DS-d4c2a5a0-435d-4b44-b408-3cd04587cd09 : 1.0
But somehow YARN can't read that file when executing the job and throws "file does not exist". I did set permission 777 on /tmp, but this file is created by Hive itself during execution, so I can't do anything with it.
I suspect this problem is related to the user or permissions when using Hive on Hadoop. What should I do?

job.xml - Character reference "&#0" is an invalid XML character

We are getting a strange job.xml error while running a MapReduce task. I am also unable to get to the job.xml itself. The following is my environment:
Hadoop - 2.6.2
Hive - 1.1.0
2017-03-10 02:13:11,788 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1488810291982_0019_000002
2017-03-10 02:13:11,953 FATAL [main] org.apache.hadoop.conf.Configuration: error parsing conf job.xml
org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1488810291982_0019/container_1488810291982_0019_02_000001/job.xml; lineNumber: 1027; columnNumber: 141; Character reference "&#0" is an invalid XML character.
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2432)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2501)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2454)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2361)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1188)
at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:51)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1421)
2017-03-10 02:13:11,955 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1488810291982_0019/container_1488810291982_0019_02_000001/job.xml; lineNumber: 1027; columnNumber: 141; Character reference "&#0" is an invalid XML character.
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2597)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2454)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2361)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1188)
at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:51)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1421)

Table not found exception (E0729) while executing Hive query from Oozie workflow

Script_SusRes.q
select * from ufo_session_details limit 5
Workflow_SusRes.xml
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-wf">
<start to="hive-node"/>
<action name="hive-node">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>default</value>
</property>
</configuration>
<script>Script_SusRes.q</script>
</hive>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
SusRes.properties
oozieClientUrl=http://zltv5636.vci.att.com:11000/oozie
nameNode=hdfs://zltv5635.vci.att.com:8020
jobTracker=zltv5636.vci.att.com:50300
queueName=default
userName=wfe
oozie.use.system.libpath=true
oozie.libpath = ${nameNode}/tmp/nt283s
oozie.wf.application.path=/tmp/nt283s/workflow_SusRes.xml
Error Log
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [10001] Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher ends
stderr logs
Logging initialized using configuration in file:/opt/app/workload/hadoop/mapred/local/taskTracker/wfe/jobcache/job_201510130626_0451/attempt_201510130626_0451_m_000000_0/work/hive-log4j.properties
FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'ufo_session_details'
Intercepting System.exit(10001)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [10001]
syslog logs
2015-11-03 00:26:20,599 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2015-11-03 00:26:20,902 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /opt/app/workload/hadoop/mapred/local/taskTracker/wfe/distcache/8045442539840332845_326451332_1282624021/zltv5635.vci.att.com/tmp/nt283s/Script_SusRes.q <- /opt/app/workload/hadoop/mapred/local/taskTracker/wfe/jobcache/job_201510130626_0451/attempt_201510130626_0451_m_000000_0/work/Script_SusRes.q
2015-11-03 00:26:20,911 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /opt/app/workload/hadoop/mapred/local/taskTracker/wfe/distcache/3435440518513182209_187825668_1219418250/zltv5635.vci.att.com/tmp/nt283s/Script_SusRes.sql <- /opt/app/workload/hadoop/mapred/local/taskTracker/wfe/jobcache/job_201510130626_0451/attempt_201510130626_0451_m_000000_0/work/Script_SusRes.sql
2015-11-03 00:26:20,913 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /opt/app/workload/hadoop/mapred/local/taskTracker/wfe/distcache/-5883507949569818012_2054276612_1203833745/zltv5635.vci.att.com/tmp/nt283s/lib <- /opt/app/workload/hadoop/mapred/local/taskTracker/wfe/jobcache/job_201510130626_0451/attempt_201510130626_0451_m_000000_0/work/lib
2015-11-03 00:26:20,916 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /opt/app/workload/hadoop/mapred/local/taskTracker/wfe/distcache/6682880817470643170_1186359172_1225814386/zltv5635.vci.att.com/tmp/nt283s/workflow_SusRes.xml <- /opt/app/workload/hadoop/mapred/local/taskTracker/wfe/jobcache/job_201510130626_0451/attempt_201510130626_0451_m_000000_0/work/workflow_SusRes.xml
2015-11-03 00:26:21,441 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2015-11-03 00:26:21,448 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@698cdde3
2015-11-03 00:26:21,602 INFO org.apache.hadoop.mapred.MapTask: Processing split: hdfs://zltv5635.vci.att.com:8020/user/wfe/oozie-oozi/0000088-151013062722898-oozie-oozi-W/hive-node--hive/input/dummy.txt:0+5
2015-11-03 00:26:21,630 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
2015-11-03 00:26:21,635 INFO com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
2015-11-03 00:26:21,652 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library is available
2015-11-03 00:26:21,652 INFO org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library loaded
2015-11-03 00:26:21,663 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
2015-11-03 00:26:22,654 INFO SessionState:
Logging initialized using configuration in file:/opt/app/workload/hadoop/mapred/local/taskTracker/wfe/jobcache/job_201510130626_0451/attempt_201510130626_0451_m_000000_0/work/hive-log4j.properties
2015-11-03 00:26:22,910 INFO org.apache.hadoop.hive.ql.Driver: <PERFLOG method=Driver.run>
2015-11-03 00:26:22,911 INFO org.apache.hadoop.hive.ql.Driver: <PERFLOG method=TimeToSubmit>
2015-11-03 00:26:22,912 INFO org.apache.hadoop.hive.ql.Driver: <PERFLOG method=compile>
2015-11-03 00:26:22,998 INFO hive.ql.parse.ParseDriver: Parsing command: select * from ufo_session_details limit 5
2015-11-03 00:26:23,618 INFO hive.ql.parse.ParseDriver: Parse Completed
2015-11-03 00:26:23,799 INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer: Starting Semantic Analysis
2015-11-03 00:26:23,802 INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
2015-11-03 00:26:23,802 INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer: Get metadata for source tables
2015-11-03 00:26:23,990 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2015-11-03 00:26:24,031 INFO org.apache.hadoop.hive.metastore.ObjectStore: ObjectStore, initialize called
2015-11-03 00:26:24,328 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
2015-11-03 00:26:28,112 INFO org.apache.hadoop.hive.metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2015-11-03 00:26:28,169 INFO org.apache.hadoop.hive.metastore.ObjectStore: Initialized ObjectStore
2015-11-03 00:26:30,767 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: 0: get_table : db=default tbl=ufo_session_details
2015-11-03 00:26:30,768 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: ugi=wfe ip=unknown-ip-addr cmd=get_table : db=default tbl=ufo_session_details
2015-11-03 00:26:30,781 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2015-11-03 00:26:30,782 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
2015-11-03 00:26:33,319 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: NoSuchObjectException(message:default.ufo_session_details table not found)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1380)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
at com.sun.proxy.$Proxy11.get_table(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:836)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at com.sun.proxy.$Proxy12.getTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:945)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:887)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1083)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8680)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:261)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:238)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:49)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:491)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
1677 [main] INFO org.apache.hadoop.hive.ql.Driver -
1679 [main] INFO org.apache.hadoop.hive.ql.Driver -
1680 [main] INFO org.apache.hadoop.hive.ql.Driver -
1771 [main] INFO hive.ql.parse.ParseDriver - Parsing command: select * from ufo_session_master limit 5
2512 [main] INFO hive.ql.parse.ParseDriver - Parse Completed
2683 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - Starting Semantic Analysis
2686 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - Completed phase 1 of Semantic Analysis
2686 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - Get metadata for source tables
2831 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://zltv5636.vci.att.com:9083
2952 [main] WARN hive.metastore - Failed to connect to the MetaStore Server...
2952 [main] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
3952 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://zltv5636.vci.att.com:9083
3959 [main] WARN hive.metastore - Failed to connect to the MetaStore Server...
3960 [main] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
4960 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://zltv5636.vci.att.com:9083
4967 [main] WARN hive.metastore - Failed to connect to the MetaStore Server...
4967 [main] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
5978 [main] ERROR org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table ufo_session_master

ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint

I'm using Hadoop 2.2.0 in a cluster setup and I repeatedly get the following error. The exception is produced on the name node olympus, in the file /opt/dev/hadoop/2.2.0/logs/hadoop-deploy-secondarynamenode-olympus.log, e.g.:
2014-02-12 16:19:59,013 INFO org.mortbay.log: Started SelectChannelConnector@olympus:50090
2014-02-12 16:19:59,013 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Web server init done
2014-02-12 16:19:59,013 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Secondary Web-server up at: olympus:50090
2014-02-12 16:19:59,013 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Checkpoint Period :3600 secs (60 min)
2014-02-12 16:19:59,013 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Log Size Trigger :1000000 txns
2014-02-12 16:20:59,161 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
LV = -47 namespaceID = 291272852 cTime = 0 ; clusterId = CID-e3e4ac32-7384-4a1f-9dce-882a6e2f4bd4 ; blockpoolId = BP-166254569-192.168.92.21-1392217748925.
Expecting respectively: -47; 431978717; 0; CID-85b65e19-4030-445b-af8e-5933e75a6e5a; BP-1963497814-192.168.92.21-1392217083597.
at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:133)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:519)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:380)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:346)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:456)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:342)
at java.lang.Thread.run(Thread.java:744)
2014-02-12 16:21:59,183 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
LV = -47 namespaceID = 291272852 cTime = 0 ; clusterId = CID-e3e4ac32-7384-4a1f-9dce-882a6e2f4bd4 ; blockpoolId = BP-166254569-192.168.92.21-1392217748925.
Expecting respectively: -47; 431978717; 0; CID-85b65e19-4030-445b-af8e-5933e75a6e5a; BP-1963497814-192.168.92.21-1392217083597.
at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:133)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:519)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:380)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:346)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:456)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:342)
at java.lang.Thread.run(Thread.java:744)
Can anyone advise what's wrong here?
I had the same error and it went away when I deleted the [hadoop temporary directory]/dfs/namesecondary directory.
For me, [hadoop temporary directory] is the value of hadoop.tmp.dir in core-site.xml.
We need to stop the Hadoop services first and then delete the temporary secondary namenode directory (hadoop.tmp.dir gives the path of the secondary namenode data directory). After this, start the services again and the issue will be fixed.
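As a sketch of that procedure, assuming the standard sbin scripts are on your PATH and using /app/hadoop/tmp only as an example value of hadoop.tmp.dir (substitute whatever your core-site.xml sets):
# stop HDFS, remove the secondary namenode checkpoint directory, then restart
stop-dfs.sh
rm -rf /app/hadoop/tmp/dfs/namesecondary    # i.e. <hadoop.tmp.dir>/dfs/namesecondary
start-dfs.sh
By default the checkpoint directory is ${hadoop.tmp.dir}/dfs/namesecondary, so the SecondaryNameNode simply recreates it from the active NameNode at the next checkpoint.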

Datanode starts but not namenode

After a bit of struggling I had eventually managed to use Hadoop in pseudo-distributed mode, with a namenode and a jobtracker working perfectly (at http://localhost:50070 and http://localhost:50030).
Yesterday I tried to restart my namenode, datanode, etc with:
$hadoop namenode -format
$start-all.sh
And jps gives me the following output:
17148 DataNode
17295 SecondaryNameNode
17419 JobTracker
17669 Jps
Namenode doesn't seem to be willing to start anymore ... And Jobtracker dies a few seconds later.
Note that I hadn't restarted my computer and that I've tried the solution given in the following thread, Namenode not getting started, but it didn't help.
Here is the log of the namenode, with a bunch of errors; I don't know how to solve my issue at all:
2013-08-16 09:02:21,647 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost.lan/192.168.1.94
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_25
************************************************************/
2013-08-16 09:02:21,839 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-08-16 09:02:21,868 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-08-16 09:02:21,871 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-08-16 09:02:21,871 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2013-08-16 09:02:22,098 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-08-16 09:02:22,103 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-08-16 09:02:22,110 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2013-08-16 09:02:22,111 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source NameNode registered.
2013-08-16 09:02:22,140 INFO org.apache.hadoop.hdfs.util.GSet: Computing capacity for map BlocksMap
2013-08-16 09:02:22,140 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 64-bit
2013-08-16 09:02:22,140 INFO org.apache.hadoop.hdfs.util.GSet: 2.0% max memory = 932118528
2013-08-16 09:02:22,140 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^21 = 2097152 entries
2013-08-16 09:02:22,140 INFO org.apache.hadoop.hdfs.util.GSet: recommended=2097152, actual=2097152
2013-08-16 09:02:22,174 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=rlk
2013-08-16 09:02:22,174 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2013-08-16 09:02:22,174 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2013-08-16 09:02:22,189 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100
2013-08-16 09:02:22,189 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2013-08-16 09:02:22,271 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean
2013-08-16 09:02:22,320 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
2013-08-16 09:02:22,321 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2013-08-16 09:02:22,363 INFO org.apache.hadoop.hdfs.server.common.Storage: Start loading image file /home/rlk/hduser/dfs/name/current/fsimage
2013-08-16 09:02:22,364 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
2013-08-16 09:02:22,372 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
2013-08-16 09:02:22,375 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file /home/rlk/hduser/dfs/name/current/fsimage of size 109 bytes loaded in 0 seconds.
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Start loading edits file /home/rlk/hduser/dfs/name/current/edits
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: EOF of /home/rlk/hduser/dfs/name/current/edits, reached end of edit log Number of transactions found: 0. Bytes read: 4
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Start checking end of edit log (/home/rlk/hduser/dfs/name/current/edits) ...
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Checked the bytes after the end of edit log (/home/rlk/hduser/dfs/name/current/edits):
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Padding position = -1 (-1 means padding not found)
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Edit log length = 4
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Read length = 4
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Corruption length = 0
2013-08-16 09:02:22,376 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Toleration length = 0 (= dfs.namenode.edits.toleration.length)
2013-08-16 09:02:22,382 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Summary: |---------- Read=4 ----------|-- Corrupt=0 --|-- Pad=0 --|
2013-08-16 09:02:22,382 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Edits file /home/rlk/hduser/dfs/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
2013-08-16 09:02:22,387 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file /home/rlk/hduser/dfs/name/current/fsimage of size 109 bytes saved in 0 seconds.
2013-08-16 09:02:22,553 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: closing edit log: position=4, editlog=/home/rlk/hduser/dfs/name/current/edits
2013-08-16 09:02:22,553 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: close success: truncate to 4, editlog=/home/rlk/hduser/dfs/name/current/edits
2013-08-16 09:02:22,933 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 0 entries 0 lookups
2013-08-16 09:02:22,933 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 776 msecs
2013-08-16 09:02:22,935 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.safemode.threshold.pct = 0.9990000128746033
2013-08-16 09:02:22,935 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
2013-08-16 09:02:22,935 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.safemode.extension = 30000
2013-08-16 09:02:22,935 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of blocks excluded by safe block count: 0 total blocks: 0 and thus the safe blocks: 0
2013-08-16 09:02:22,956 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of blocks = 0
2013-08-16 09:02:22,956 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of invalid blocks = 0
2013-08-16 09:02:22,956 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of under-replicated blocks = 0
2013-08-16 09:02:22,956 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of over-replicated blocks = 0
2013-08-16 09:02:22,956 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode termination scan for invalid, over- and under-replicated blocks completed in 21 msec
2013-08-16 09:02:22,956 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 0 secs
2013-08-16 09:02:22,956 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
2013-08-16 09:02:22,962 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
2013-08-16 09:02:22,972 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
2013-08-16 09:02:22,974 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 1 msec
2013-08-16 09:02:22,974 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 1 msec processing time, 1 msec clock time, 1 cycles
2013-08-16 09:02:22,974 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: First cycle completed 0 blocks in 0 msec
2013-08-16 09:02:22,974 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: InvalidateQueue QueueProcessingStatistics: Queue flush completed 0 blocks in 0 msec processing time, 0 msec clock time, 1 cycles
2013-08-16 09:02:22,983 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source FSNamesystemMetrics registered.
2013-08-16 09:02:23,026 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2013-08-16 09:02:23,029 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort8020 registered.
2013-08-16 09:02:23,030 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort8020 registered.
2013-08-16 09:02:23,037 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost.localdomain/127.0.0.1:8020
2013-08-16 09:02:23,195 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-08-16 09:02:23,306 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-08-16 09:02:23,318 INFO org.apache.hadoop.http.HttpServer: dfs.webhdfs.enabled = false
2013-08-16 09:02:23,329 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
2013-08-16 09:02:23,331 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
2013-08-16 09:02:23,331 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
2013-08-16 09:02:23,331 INFO org.mortbay.log: jetty-6.1.26
2013-08-16 09:02:23,386 INFO org.mortbay.log: Extract jar:file:/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/jre/lib/ext/hadoop-core-1.2.1.jar!/webapps/hdfs to /tmp/Jetty_0_0_0_0_50070_hdfs____w2cu08/webapp
2013-08-16 09:02:25,171 WARN org.mortbay.log: failed jsp: java.lang.NoClassDefFoundError: javax/servlet/jsp/JspFactory
2013-08-16 09:02:25,215 WARN org.mortbay.log: failed org.mortbay.jetty.webapp.WebAppContext@12305d34{/,jar:file:/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/jre/lib/ext/hadoop-core-1.2.1.jar!/webapps/hdfs}: java.lang.NoClassDefFoundError: javax/servlet/jsp/JspFactory
2013-08-16 09:02:25,225 WARN org.mortbay.log: failed ContextHandlerCollection@25370a40: java.lang.NoClassDefFoundError: javax/servlet/jsp/JspFactory
2013-08-16 09:02:25,226 ERROR org.mortbay.log: Error starting handlers
java.lang.NoClassDefFoundError: javax/servlet/jsp/JspFactory
at org.apache.jasper.servlet.JspServlet.init(JspServlet.java:99)
at org.mortbay.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:440)
at org.mortbay.jetty.servlet.ServletHolder.doStart(ServletHolder.java:263)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:736)
at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
at org.mortbay.jetty.Server.doStart(Server.java:224)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.apache.hadoop.http.HttpServer.start(HttpServer.java:638)
at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:517)
at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:395)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:337)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
Caused by: java.lang.ClassNotFoundException: javax.servlet.jsp.JspFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 27 more
2013-08-16 09:02:25,307 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50070
2013-08-16 09:02:25,307 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:rlk cause:java.io.IOException: Problem in starting http server. Server handlers failed
2013-08-16 09:02:25,308 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:50070
2013-08-16 09:02:25,308 ERROR org.mortbay.log: EXCEPTION
java.lang.NullPointerException
at org.apache.jasper.servlet.JspServlet.destroy(JspServlet.java:282)
at org.mortbay.jetty.servlet.ServletHolder.destroyInstance(ServletHolder.java:318)
at org.mortbay.jetty.servlet.ServletHolder.doStop(ServletHolder.java:289)
at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76)
at org.mortbay.jetty.servlet.ServletHandler.doStop(ServletHandler.java:185)
at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76)
at org.mortbay.jetty.handler.HandlerWrapper.doStop(HandlerWrapper.java:142)
at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76)
at org.mortbay.jetty.handler.HandlerWrapper.doStop(HandlerWrapper.java:142)
at org.mortbay.jetty.servlet.SessionHandler.doStop(SessionHandler.java:125)
at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76)
at org.mortbay.jetty.handler.HandlerWrapper.doStop(HandlerWrapper.java:142)
at org.mortbay.jetty.handler.ContextHandler.doStop(ContextHandler.java:592)
at org.mortbay.jetty.webapp.WebAppContext.doStop(WebAppContext.java:537)
at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76)
at org.mortbay.jetty.handler.HandlerCollection.doStop(HandlerCollection.java:169)
at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76)
at org.mortbay.jetty.handler.HandlerWrapper.doStop(HandlerWrapper.java:142)
at org.mortbay.jetty.Server.doStop(Server.java:283)
at org.mortbay.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:76)
at org.apache.hadoop.http.HttpServer.stop(HttpServer.java:688)
at org.apache.hadoop.hdfs.server.namenode.NameNode.stop(NameNode.java:604)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:571)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
2013-08-16 09:02:25,336 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicationMonitor thread received InterruptedExceptionjava.lang.InterruptedException: sleep interrupted
2013-08-16 09:02:25,337 INFO org.apache.hadoop.hdfs.server.namenode.DecommissionManager: Interrupted Monitor
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor.run(DecommissionManager.java:65)
at java.lang.Thread.run(Thread.java:724)
2013-08-16 09:02:25,339 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 0 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
2013-08-16 09:02:25,375 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: closing edit log: position=4, editlog=/home/rlk/hduser/dfs/name/current/edits
2013-08-16 09:02:25,375 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: close success: truncate to 4, editlog=/home/rlk/hduser/dfs/name/current/edits
2013-08-16 09:02:25,403 INFO org.apache.hadoop.ipc.Server: Stopping server on 8020
2013-08-16 09:02:25,411 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2013-08-16 09:02:25,412 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: Problem in starting http server. Server handlers failed
at org.apache.hadoop.http.HttpServer.start(HttpServer.java:662)
at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:517)
at org.apache.hadoop.hdfs.server.namenode.NameNode$1.run(NameNode.java:395)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:337)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
2013-08-16 09:02:25,413 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.lan/192.168.1.94
************************************************************/
I also give you my hadoop configuration (I'm using hadoop-1.2.1) :
core-site.xml :
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- core-site.xml -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/rlk/hduser</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost/</value>
</property>
</configuration>
hdfs-site.xml :
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- hdfs-site.xml -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml :
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- mapred-site.xml -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
</configuration>
I found the solution: it was some jar collisions. I had duplicate jar files both in hadoop-x.y.z/ and hadoop-x.y.z/lib and in path-to-java/jre/lib/ext/.
I just removed the former ones and everything works fine again.
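To spot this kind of collision, a check along these lines lists Hadoop jars that ended up under the JRE's ext directory (the JVM path is the one shown in the log above and will differ per machine):
# jars under jre/lib/ext are loaded by the extension classloader and can conflict
# with the copies shipped in hadoop-1.2.1/ and hadoop-1.2.1/lib/
find /usr/lib/jvm -path '*/jre/lib/ext/*' -name '*hadoop*.jar'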
You did not mention the port number for the master node in core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://master:port</value>
</property>
The problem is in core-site.xml; please set it properly:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/rlk/hduser</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>

Resources