Error in reducer function of Hadoop multi-node cluster

I followed the tutorial linked here.
When I run the command from step 8 of that tutorial:
hduser@ila:/usr/local/hadoop-0.22.0$ ./bin/hadoop jar hadoop-mapred-examples-0.22.0.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg6-out
it runs the map phase correctly but not the reduce phase, and gives the following error:
12/04/24 02:06:56 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
12/04/24 02:06:56 INFO input.FileInputFormat: Total input paths to process : 3
12/04/24 02:06:56 INFO mapreduce.JobSubmitter: number of splits:3
12/04/24 02:06:56 INFO mapreduce.Job: Running job: job_201204232307_0012
12/04/24 02:06:57 INFO mapreduce.Job: map 0% reduce 0%
12/04/24 02:07:06 INFO mapreduce.Job: map 33% reduce 0%
12/04/24 02:07:09 INFO mapreduce.Job: map 100% reduce 0%
12/04/24 02:07:15 INFO mapreduce.Job: map 100% reduce 11%
12/04/24 02:08:14 INFO mapreduce.Job: Task Id : attempt_201204232307_0012_r_000000_0, Status : FAILED
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
at org.apache.hadoop.mapred.Child$4.run(Child.java:223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1153)
at org.apache.hadoop.mapred.Child.main(Child.java:217)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:227)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
How can I overcome this problem?
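A note on the error itself: "Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out" means the reduce task repeatedly failed to fetch map output from the other node over HTTP, which in a fresh multi-node setup is usually a hostname-resolution or firewall problem rather than a bug in the job. A minimal diagnostic sketch, assuming the tutorial's master/slave host names and the default TaskTracker HTTP port (both are assumptions):

# run on every node: all hosts must resolve each other by the same names
cat /etc/hosts
ping -c 1 master
ping -c 1 slave

# the reducer pulls map output from the TaskTracker HTTP port (50060 by default)
telnet slave 50060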
@Chris White: I faced a new problem that gives a new error when I run the above command:
hduser@vijay-P5E-VM-DO:/usr/local/hadoop-1.0.0$ ./bin/hadoop jar hadoop-examples-1.0.0.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg9-out
12/05/14 20:42:28 INFO mapred.JobClient: Cleaning up the staging area hdfs://master:54310/app/hadoop/tmp/mapred/staging/hduser/.staging/job_201205142041_0001
12/05/14 20:42:28 ERROR security.UserGroupInformation: PriviledgedActionException as:hduser cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://master:54310/user/hduser/gutenberg
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://master:54310/user/hduser/gutenberg
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:465)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:495)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
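This second failure is more basic: the input path hdfs://master:54310/user/hduser/gutenberg does not exist, so the job is rejected before it starts. A hedged check-and-reupload sketch, assuming the sample texts still sit in /tmp/gutenberg on the local disk as in the tutorial:

# what is actually under /user/hduser in HDFS?
./bin/hadoop fs -ls /user/hduser

# re-upload the Gutenberg texts if the directory is missing
./bin/hadoop fs -copyFromLocal /tmp/gutenberg /user/hduser/gutenberg
./bin/hadoop fs -ls /user/hduser/gutenberg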

Related

Sqoop job failing due to the following reason

java.lang.Exception: java.io.IOException: Mkdirs failed to create file:/user/City/_temporary/0/_temporary/attempt_local1259965155_0001_m_000000_0 (exists=false, cwd=file:/home/centos)
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:489)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:549)
Caused by: java.io.IOException: Mkdirs failed to create file:/user/City/_temporary/0/_temporary/attempt_local1259965155_0001_m_000000_0 (exists=false, cwd=file:/home/centos)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:447)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:433)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:926)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:907)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:804)
at org.apache.sqoop.mapreduce.RawKeyTextOutputFormat.getRecordWriter(RawKeyTextOutputFormat.java:98)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:653)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
18/12/05 13:14:05 INFO mapreduce.Job: Job job_local1259965155_0001 running in uber mode : false
18/12/05 13:14:05 INFO mapreduce.Job: map 0% reduce 0%
18/12/05 13:14:05 INFO mapreduce.Job: Job job_local1259965155_0001 failed with state FAILED due to: NA
18/12/05 13:14:05 INFO mapreduce.Job: Counters: 0
18/12/05 13:14:05 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
18/12/05 13:14:05 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 2.6049 seconds (0 bytes/sec)
18/12/05 13:14:05 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
18/12/05 13:14:05 INFO mapreduce.ImportJobBase: Retrieved 0 records.
18/12/05 13:14:05 ERROR tool.ImportAllTablesTool: Error during import: Import job failed!
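The file:/ scheme in the Mkdirs error and the LocalJobRunner frames in the trace suggest the job is running against the local filesystem instead of HDFS, so Sqoop tries to create /user/City on the local disk (where the working directory is /home/centos and /user is normally not writable). A hedged sketch of the two usual checks; the config path, NameNode address, and connect string are assumptions:

# is fs.defaultFS pointing at HDFS in the config Sqoop actually sees?
grep -A1 'fs.defaultFS' $HADOOP_HOME/etc/hadoop/core-site.xml

# or pass the filesystem explicitly for one run via the generic Hadoop options
sqoop import-all-tables -fs hdfs://namenode:8020 \
  --connect jdbc:mysql://dbhost/citydb --username sqoop -P --warehouse-dir /user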

Executing a Hive query causes the YARN ResourceManager to throw a "file does not exist" exception

I'm configuring Hive 3.1.0 to work with Hadoop 3.0.0.
This error is thrown almost immediately when I submit a simple query in Beeline that triggers a MapReduce job:
0: jdbc:hive2://> select count(*) from airlinedata;
18/10/11 10:24:45 [HiveServer2-Background-Pool: Thread-124]: WARN ql.Driver: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = UUT81HC_20181011102444_2df01ff5-ca05-403c-b0e1-15f8f7715dc7
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
2018-10-11 10:24:45,510 INFO [HiveServer2-Background-Pool: Thread-124] client.RMProxy (RMProxy.java:newProxyInstance(133)) - Connecting to ResourceManager at /10.184.153.232:8032
2018-10-11 10:24:45,555 INFO [HiveServer2-Background-Pool: Thread-124] client.RMProxy (RMProxy.java:newProxyInstance(133)) - Connecting to ResourceManager at /10.184.153.232:8032
18/10/11 10:24:45 [HiveServer2-Background-Pool: Thread-124]: WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
WARN : Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73)
at org.apache.hadoop.mapreduce.TypeConverter.toYarn(TypeConverter.java:78)
at org.apache.hadoop.mapred.ClientServiceDelegate.<init>(ClientServiceDelegate.java:120)
at org.apache.hadoop.mapred.ClientCache.getClient(ClientCache.java:68)
at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:343)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:329)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70)
... 40 more
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
Location:
org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual
Reason:
Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage'
Current Frame:
bci: #36
flags: { }
locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
Bytecode:
0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7
0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a
0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5
0x0000030: 0013 2ab0
Stackmap Table:
same_frame(#19)
same_frame(#31)
same_frame(#40)
at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017)
at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37)
... 45 more
YARN ResourceManager stack trace:
2018-10-11 10:24:49,896 INFO rmapp.RMAppImpl: application_1539226955170_0002 State change from ACCEPTED to FINAL_SAVING on event = ATTEMPT_FAILED
2018-10-11 10:24:49,896 INFO recovery.RMStateStore: Updating info for app: application_1539226955170_0002
2018-10-11 10:24:49,897 INFO capacity.CapacityScheduler: Application Attempt appattempt_1539226955170_0002_000002 is done. finalState=FAILED
2018-10-11 10:24:49,897 INFO rmapp.RMAppImpl: Application application_1539226955170_0002 failed 2 times due to AM Container for appattempt_1539226955170_0002_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2018-10-11 10:24:49.876]File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
java.io.FileNotFoundException: File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1495)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1488)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1503)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:366)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:241)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:234)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:222)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002 Then click on links to logs of each attempt.
. Failing the application.
2018-10-11 10:24:49,897 INFO scheduler.AppSchedulingInfo: Application application_1539226955170_0002 requests cleared
2018-10-11 10:24:49,897 INFO rmapp.RMAppImpl: application_1539226955170_0002 State change from FINAL_SAVING to FAILED on event = APP_UPDATE_SAVED
2018-10-11 10:24:49,898 INFO capacity.LeafQueue: Application removed - appId: application_1539226955170_0002 user: UUT81HC queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2018-10-11 10:24:49,898 WARN resourcemanager.RMAuditLogger: USER=UUT81HC OPERATION=Application Finished - Failed
TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1539226955170_0002 failed 2 times due to AM Container for appattempt_1539226955170_0002_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2018-10-11 10:24:49.876]File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
java.io.FileNotFoundException: File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1495)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1488)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1503)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:366)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:241)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:234)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:222)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002 Then click on links to logs of each attempt.
. Failing the application. APPID=application_1539226955170_0002
2018-10-11 10:24:49,898 INFO capacity.ParentQueue: Application removed - appId: application_1539226955170_0002 user: UUT81HC leaf-queue of parent: root #applications: 0
2018-10-11 10:24:49,899 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1539226955170_0002,name=select count(*) from airlinedata (Stage-1),user=UUT81HC,queue=default,state=FAILED,trackingUrl=http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002,appMasterHost=N/A,submitTime=1539228287412,startTime=1539228287413,finishTime=1539228289896,finalStatus=FAILED,memorySeconds=1482,vcoreSeconds=0,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=,applicationType=MAPREDUCE,resourceSeconds=1482 MB-seconds\, 0 vcore-seconds,preemptedResourceSeconds=0 MB-seconds\, 0 vcore-seconds
After examining how Hive executes a MapReduce job on YARN, I found that it first creates map.xml and reduce.xml in /tmp with permission drwx------ (only the owner can use them):
2018-10-11 10:24:45,133 INFO hdfs.StateChange: BLOCK* allocate blk_1073742318_1495, replicas=10.184.153.232:9866 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml
2018-10-11 10:24:45,225 INFO hdfs.StateChange: DIR* completeFile: /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:45,248 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml
2018-10-11 10:24:45,294 INFO hdfs.StateChange: BLOCK* allocate blk_1073742319_1496, replicas=10.184.153.232:9866 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
2018-10-11 10:24:45,411 INFO hdfs.StateChange: DIR* completeFile: /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:45,437 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
2018-10-11 10:24:45,772 INFO hdfs.StateChange: BLOCK* allocate blk_1073742320_1497, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar
2018-10-11 10:24:46,438 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,463 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar
2018-10-11 10:24:46,618 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split
2018-10-11 10:24:46,639 INFO hdfs.StateChange: BLOCK* allocate blk_1073742321_1498, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split
2018-10-11 10:24:46,706 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,791 INFO hdfs.StateChange: BLOCK* allocate blk_1073742322_1499, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.splitmetainfo
2018-10-11 10:24:46,870 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.splitmetainfo is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,971 INFO hdfs.StateChange: BLOCK* allocate blk_1073742323_1500, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.xml
2018-10-11 10:24:47,370 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:32:15,741 INFO blockmanagement.BlockManager: StorageInfo TreeSet fill ratio DS-d4c2a5a0-435d-4b44-b408-3cd04587cd09 : 1.0
But somehow YARN can't read those files when executing the job and throws "file does not exist". I did set permission 777 on /tmp, but these files are created by Hive itself during execution, so I can't do anything about them.
I suspect this problem is related to users or permissions when using Hive on Hadoop. What should I do?
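One thing worth checking before chasing permissions: the client-side trace above dies with java.lang.VerifyError, complaining that YarnProtos$ApplicationIdProto is not assignable to com.google.protobuf.GeneratedMessage. That is a classic symptom of mismatched Hadoop/protobuf jars between Hive and the cluster (Hive 3.1.0 builds against Hadoop 3.1.x rather than 3.0.0). A hedged way to compare what is actually on the classpath; the install-path layout is an assumption:

# which Hadoop is on the path that HiveServer2 uses?
hadoop version

# compare the protobuf jars each side can load
find $HIVE_HOME/lib $HADOOP_HOME/share/hadoop -name 'protobuf-java-*.jar'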

Hive Insert - Failed with exception Unable to alter table. java.lang.NullPointerException

We are using Cloudera 5.6 and have configured Sentry for Hive. Whenever we issue an insert statement, it fails with the exception below, but when we check the table, the row has been inserted properly. We have set all the permissions for hive.
$$$ insert into beckman values('rinku');
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:1
INFO : Submitting tokens for job: job_1459405260708_0002
INFO : The url to track the job: http://xxx.xxx.xxx.xxx:8088/proxy/application_1459405260708_0002/
INFO : Starting Job = job_1459405260708_0002, Tracking URL = http://xxx.xxx.xxx.xxx:8088/proxy/application_1459405260708_0002/
INFO : Kill Command = /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/bin/hadoop job -kill job_1459405260708_0002
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
INFO : 2016-03-31 23:20:31,401 Stage-1 map = 0%, reduce = 0%
INFO : 2016-03-31 23:20:37,788 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.4 sec
INFO : MapReduce Total cumulative CPU time: 1 seconds 400 msec
INFO : Ended Job = job_1459405260708_0002
INFO : Stage-4 is selected by condition resolver.
INFO : Stage-3 is filtered out by condition resolver.
INFO : Stage-5 is filtered out by condition resolver.
INFO : Moving data to: hdfs://ip-172-31-0-203.us-west-2.compute.internal:8020/user/hive2/warehouse/test.db/beckman/.hive-staging_hive_2016-03-31_23-20-23_107_7263355827488393299-2/-ext-10000 from hdfs://ip-172-31-0-203.us-west-2.compute.internal:8020/user/hive2/warehouse/test.db/beckman/.hive-staging_hive_2016-03-31_23-20-23_107_7263355827488393299-2/-ext-10002
INFO : Loading data to table test.beckman from hdfs://ip-172-31-0-203.us-west-2.compute.internal:8020/user/hive2/warehouse/test.db/beckman/.hive-staging_hive_2016-03-31_23-20-23_107_7263355827488393299-2/-ext-10000
ERROR : Failed with exception Unable to alter table. java.lang.NullPointerException
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:533)
at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:519)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1685)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:312)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1645)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1404)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1190)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1055)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1050)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:143)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:195)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1707)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:207)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: MetaException(message:java.lang.NullPointerException)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_with_cascade_result$alter_table_with_cascade_resultStandardScheme.read(ThriftHiveMetastore.java:42087)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_with_cascade_result$alter_table_with_cascade_resultStandardScheme.read(ThriftHiveMetastore.java:42064)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_with_cascade_result.read(ThriftHiveMetastore.java:42006)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_with_cascade(ThriftHiveMetastore.java:1402)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_with_cascade(ThriftHiveMetastore.java:1386)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:340)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table(SessionHiveMetaStoreClient.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:91)
at com.sun.proxy.$Proxy10.alter_table(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:1998)
at com.sun.proxy.$Proxy10.alter_table(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:531)
... 22 more
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask (state=08S01,code=1)
I have noticed this issue when you alter a table using the database-qualified syntax, i.e. something like "<DATABASE>.<TABLENAME>". Somehow that form triggers the problem.
To overcome it, explicitly select the database before you execute the actual query. A sample illustration:
use sampledb;
alter table employeeTable ...;   -- this works

vs. [do not use the below]
alter table sampledb.employeeTable ...;   -- this doesn't work
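A concrete illustration via beeline; the connection string, user, and table property here are illustrative assumptions, not from the original report:

# works: select the database first, then use the bare table name
beeline -u jdbc:hive2://localhost:10000 -n hive -e "use sampledb; alter table employeeTable set tblproperties ('touched'='yes');"

# the database-qualified form the answer warns about
beeline -u jdbc:hive2://localhost:10000 -n hive -e "alter table sampledb.employeeTable set tblproperties ('touched'='yes');"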

Can't create an external table in Hive pointing to an HBase one

I am a student trying to understand how the whole Hadoop stack works, so I'm running Cloudera on 15 machines. The configuration is fine; all services are green.
I imported a 12k-row MySQL table into HBase and all went fine too.
I wanted to run queries on these data, but I know I can't do that directly with HBase. That's why I want to create an external table with the following code:
CREATE EXTERNAL TABLE ViewSimulation2 (
  id int,
  eol int,
  sensor int,
  value1 float,
  value2 float,
  value3 float,
  value4 float,
  value5 float,
  value6 float)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" =
  ":key,data:eol,data:sensor,data:value1,data:value2,data:value3,data:value4,data:value5,data:value6"
)
TBLPROPERTIES("hbase.table.name" = "Simulation");
When I run it in the console it freezes and I have to Ctrl-C to cancel it. In Hue, all I get is these messages looping again and again:
13/12/04 07:33:48 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1386171228376 end=1386171228483 duration=107>
13/12/04 07:33:48 INFO exec.DDLTask: Use StorageHandler-supplied org.apache.hadoop.hive.hbase.HBaseSerDe for table ViewSimulation2
13/12/04 07:33:48 INFO hive.metastore: Trying to connect to metastore with URI thrift://insset-5.cloudera.insset.fr:9083
13/12/04 07:33:48 INFO hive.metastore: Waiting 1 seconds before next connection attempt.
13/12/04 07:33:49 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
13/12/04 07:33:49 WARN conf.Configuration: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
13/12/04 07:33:49 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
13/12/04 07:33:49 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
13/12/04 07:33:49 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 16770@insset-8.cloudera.insset.fr
13/12/04 07:33:49 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
13/12/04 07:33:49 INFO util.RetryCounter: Sleeping 2000ms before retry #1...
13/12/04 07:33:51 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
13/12/04 07:33:51 INFO util.RetryCounter: Sleeping 4000ms before retry #2...
13/12/04 07:33:56 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
13/12/04 07:33:56 INFO util.RetryCounter: Sleeping 8000ms before retry #3...
13/12/04 07:34:05 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
13/12/04 07:34:05 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
13/12/04 07:34:05 WARN zookeeper.ZKUtil: hconnection Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:450)
at org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61)
at org.apache.hadoop.hbase.zookeeper.ClusterId.getId(ClusterId.java:50)
at org.apache.hadoop.hbase.zookeeper.ClusterId.hasId(ClusterId.java:44)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:615)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:684)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:126)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:73)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:147)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:428)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at $Proxy13.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:576)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3719)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:254)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:344)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:609)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:598)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:337)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1388)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:598)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
13/12/04 07:34:05 ERROR zookeeper.ZooKeeperWatcher: hconnection Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:450)
at org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61)
at org.apache.hadoop.hbase.zookeeper.ClusterId.getId(ClusterId.java:50)
at org.apache.hadoop.hbase.zookeeper.ClusterId.hasId(ClusterId.java:44)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:615)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:684)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:126)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:73)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:147)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:428)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at $Proxy13.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:576)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3719)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:254)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:344)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:609)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:598)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:337)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1388)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:598)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
After that, I get other errors like this one:
13/12/04 07:49:53 INFO client.HConnectionManager$HConnectionImplementation: ZooKeeper available but no active master location found
13/12/04 07:49:53 INFO client.HConnectionManager$HConnectionImplementation: getMaster attempt 8 of 10 failed; retrying after sleep of 16069
org.apache.hadoop.hbase.MasterNotRunningException
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:706)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:126)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:73)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:147)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:428)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at $Proxy13.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:576)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3719)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:254)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute
Which is weird because, as I said, everything is running fine, including the HBase master...
Could it be a bad ZooKeeper configuration? I have made many, many searches and I didn't find anything that helps.
It seems like ZooKeeper is not set up correctly. Maybe the dataDir does not exist: http://archive.cloudera.com/cdh4/cdh/4/zookeeper/zookeeperAdmin.html#sc_zkMulitServerSetup
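Note also that the client log above shows connectString=localhost:2181: the node running the query is looking for ZooKeeper on itself rather than on the real quorum, which on a 15-machine Cloudera cluster is almost certainly wrong. A hedged pair of checks; the config path follows CDH defaults and is an assumption:

# is a ZooKeeper actually answering on the address the client uses?
echo ruok | nc localhost 2181        # a healthy server answers "imok"

# is the real quorum configured where Hive/HBase clients can see it?
grep -A1 'hbase.zookeeper.quorum' /etc/hbase/conf/hbase-site.xml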

hadoop + Writable interface + readFields throws an exception in reducer

I have a simple map-reduce program in which my map and reduce primitives look like this:
map(K,V) = (Text, OutputAggregator)
reduce(Text, OutputAggregator) = (Text, Text)
The important point is that my map function emits an object of type OutputAggregator, which is my own class implementing the Writable interface. However, my reduce fails with the exception below; more specifically, it is the readFields() function that throws. Any clue why? I use Hadoop 0.18.3.
10/09/19 04:04:59 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/09/19 04:04:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/09/19 04:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1
10/09/19 04:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1
10/09/19 04:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1
10/09/19 04:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1
10/09/19 04:04:59 INFO mapred.JobClient: Running job: job_local_0001
10/09/19 04:04:59 INFO mapred.MapTask: numReduceTasks: 1
10/09/19 04:04:59 INFO mapred.MapTask: io.sort.mb = 100
10/09/19 04:04:59 INFO mapred.MapTask: data buffer = 79691776/99614720
10/09/19 04:04:59 INFO mapred.MapTask: record buffer = 262144/327680
Length = 10
10
10/09/19 04:04:59 INFO mapred.MapTask: Starting flush of map output
10/09/19 04:04:59 INFO mapred.MapTask: bufstart = 0; bufend = 231; bufvoid = 99614720
10/09/19 04:04:59 INFO mapred.MapTask: kvstart = 0; kvend = 10; length = 327680
gl_books
10/09/19 04:04:59 WARN mapred.LocalJobRunner: job_local_0001
java.lang.NullPointerException
at org.myorg.OutputAggregator.readFields(OutputAggregator.java:46)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:751)
at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:691)
at org.apache.hadoop.mapred.Task$CombineValuesIterator.next(Task.java:770)
at org.myorg.xxxParallelizer$Reduce.reduce(xxxParallelizer.java:117)
at org.myorg.xxxParallelizer$Reduce.reduce(xxxParallelizer.java:1)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:904)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:785)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:228)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:157)
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1113)
at org.myorg.xxxParallelizer.main(xxxParallelizer.java:145)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
When posting a question about custom code, post the relevant piece of code: the content of line 46 and a few lines before and after would really help... :)
However this may help:
THE pitfall when writing your own Writable class is the fact that Hadoop reuses the actual instance of the class over and over again: between calls to readFields you do NOT get a shiny new instance.
So at the start of the readFields method you MUST assume the object you are in is still filled with the "garbage" of the previous record, and must be cleared before continuing.
My suggestion is to implement a clear() method that fully wipes the current instance and resets it to the state it would be in right after the constructor completed. And of course you call that method as the first thing in your readFields, for both the key and the value.
HTH
In addition to Niels Basjes' answer: just initialize your member variables in the empty constructor (which you have to supply anyway, otherwise Hadoop cannot instantiate your object), e.g.:
public OutputAggregator() {
    // Hadoop calls this no-arg constructor via reflection and then reuses
    // the instance, so every member must be initialized before readFields() runs.
    this.member = new IntWritable();
    ...
}
assuming that this.member is of type IntWritable.
