Getting an HBase error in the region server when using Flink to write to HBase - hadoop

Software versions are as follows:
apache hbase 2.1.6
apache flink 1.13.6
apache hadoop 3.1.1
When I use the hbase-client API to access HBase, I get the following error:
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=16, exceptions:
Wed Sep 28 03:03:11 UTC 2022, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68532: java.io.IOException: Invalid currTagsLen -32239. Block offset: 1319713, block length: 99991, position: 42422 (without header). path=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/cd083a4a1ef04baff94ebb5aabdb8cb8/i/1f6dd8a1bc054eefbc9faa1bf625e24f
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:472)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: java.lang.IllegalStateException: Invalid currTagsLen -32239. Block offset: 1319713, block length: 99991, position: 42422 (without header). path=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/cd083a4a1ef04baff94ebb5aabdb8cb8/i/1f6dd8a1bc054eefbc9faa1bf625e24f
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.checkTagsLen(HFileReaderImpl.java:642)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.readKeyValueLen(HFileReaderImpl.java:630)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl._next(HFileReaderImpl.java:1080)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.next(HFileReaderImpl.java:1097)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:208)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:120)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:653)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:6581)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6745)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6518)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3155)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3404)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42190)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
... 3 more
The exception on the HBase region server is as follows:
2022-09-28 11:19:36,019 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: HBase metrics system started
2022-09-28 11:20:20,946 INFO [MemStoreFlusher.0] regionserver.HRegion: Flushing 1/1 column families, dataSize=1.95 MB heapSize=2.09 MB
2022-09-28 11:20:20,969 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed memstore data size=1.95 MB at sequenceid=8934625 (bloomFilter=true), to=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/.tmp/i/2629dbae7d5e402489ef56b1c097289f
2022-09-28 11:20:20,977 INFO [MemStoreFlusher.0] regionserver.HStore: Added hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/2629dbae7d5e402489ef56b1c097289f, entries=1212, sequenceid=8934625, filesize=359.1 K
2022-09-28 11:20:20,978 INFO [MemStoreFlusher.0] regionserver.HRegion: Finished flush of dataSize ~1.95 MB/2041026, heapSize ~2.09 MB/2190200, currentSize=0 B/0 for e63ee2269b0b076a415c5f76d546855f in 32ms, sequenceid=8934625, compaction requested=true
2022-09-28 11:20:20,986 INFO [regionserver/bghbaseclusterdn9528:16020-shortCompactions-1664173471436] regionserver.HRegion: Starting compaction of i in expose,9ffffff6,1663741391432.e63ee2269b0b076a415c5f76d546855f.
2022-09-28 11:20:20,986 INFO [regionserver/bghbaseclusterdn9528:16020-shortCompactions-1664173471436] regionserver.HStore: Starting compaction of [hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/98d0ecd1ed7744a8a5f94923c382861e, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/30bab1682dba4721b25e58b78dd17255, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/f80c2f08176e417a9184f434d4300935, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/52baca576c154c26b7df3b5d126d47b8, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/7d8291d422d042de9aa43aa5b79da6ad, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/8bf3b47909ab4eeb86d8a5c283cfe942, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/0663d48a4ed94dbe9fdc78f6649c1eb3, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/b80b55d744174bc882db93283cd70c71] into tmpdir=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/.tmp, totalSize=18.9 M
2022-09-28 11:20:21,153 INFO [regionserver/bghbaseclusterdn9528:16020-shortCompactions-1664173471436] throttle.PressureAwareThroughputController: e63ee2269b0b076a415c5f76d546855f#i#compaction#637 average throughput is 122.45 MB/second, slept 0 time(s) and total slept time is 0 ms. 0 active operations remaining, total limit is 61.86 MB/second
2022-09-28 11:20:21,159 ERROR [regionserver/bghbaseclusterdn9528:16020-shortCompactions-1664173471436] regionserver.CompactSplit: Compaction failed region=expose,9ffffff6,1663741391432.e63ee2269b0b076a415c5f76d546855f., storeName=i, priority=73, startTime=1664335220978
java.lang.IllegalStateException: Invalid currTagsLen -9. Block offset: 1677972, block length: 161891, position: 48652 (without header). path=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/b80b55d744174bc882db93283cd70c71
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.checkTagsLen(HFileReaderImpl.java:642)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.readKeyValueLen(HFileReaderImpl.java:630)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl._next(HFileReaderImpl.java:1080)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.next(HFileReaderImpl.java:1097)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:208)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:120)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:653)
at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:388)
at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:327)
at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1410)
at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2187)
at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:596)
at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:638)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2022-09-28 11:20:25,000 INFO [RpcServer.default.FPBQ.Fifo.handler=18,queue=3,port=16020] regionserver.HRegion: writing data to region expose,9ffffff6,1663741391432.e63ee2269b0b076a415c5f76d546855f. with WAL disabled. Data may be lost in the event of a crash.
2022-09-28 11:24:01,565 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=1.08 GB, freeSize=2.52 GB, max=3.60 GB, blockCount=17155, accesses=133155383, hits=132992986, hitRatio=99.88%, , cachingAccesses=132985682, cachingHits=132951576, cachingHitsRatio=99.97%, evictions=16199, evicted=0, evictedPerRun=0.0
2022-09-28 11:24:01,569 INFO [MobFileCache #0] mob.MobFileCache: MobFileCache Statistics, access: 0, miss: 0, hit: 0, hit ratio: 0%, evicted files: 0
2022-09-28 11:24:05,246 INFO [regionserver/bghbaseclusterdn9528:16020.logRoller] wal.AbstractFSWAL: Rolled WAL /apps/hbase/data/WALs/bghbaseclusterdn9528,16020,1664173440239/bghbaseclusterdn9528%2C16020%2C1664173440239.1664331845190 with entries=21, filesize=5.39 KB; new WAL /apps/hbase/data/WALs/bghbaseclusterdn9528,16020,1664173440239/bghbaseclusterdn9528%2C16020%2C1664173440239.1664335445235
I found some related fixes in the HBase code base, such as HBASE-21507, HBASE-24515, and HBASE-21775.
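For context, here is a minimal sketch of the kind of client-side read that hits this error; the table name expose and column family i are taken from the HFile paths in the logs above, the class name is arbitrary, and this is illustrative only, not the actual Flink job code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ExposeScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("expose"))) {
            Scan scan = new Scan().addFamily(Bytes.toBytes("i"));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    // The RetriesExhaustedException shown above is raised while iterating,
                    // when the region server reaches the corrupt block in the HFile.
                    System.out.println(Bytes.toStringBinary(r.getRow()));
                }
            }
        }
    }
}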

Related

Hive on Hadoop, Map Reduce not working - Error: Could not find or load main class 1600

I need help resolving the issue below.
I have installed Ubuntu as a Windows subsystem on Windows 10,
with Hadoop 3.1.3 and Hive 3.1.2 installed.
When I run a normal query that does not need MapReduce, it runs fine.
hive> use bhudwh;
OK
Time taken: 1.075 seconds
hive> select id from matches where id < 5;
OK
1
2
3
4
Time taken: 6.012 seconds, Fetched: 4 row(s)
hive>
When I run a query that needs MapReduce, it throws the error - Error: Could not find or load main class 1600.
hive> select distinct id from matches;
Query ID = bhush_20200529144705_62bc4f10-1604-453f-a90c-ed905c9c1fe9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1590670326852_0003, Tracking URL = http://DESKTOP-EU9VK4S.localdomain:8088/proxy/application_1590670326852_0003/
Kill Command = /mnt/e/Study/Hadoop/hadoop-3.1.3/bin/mapred job -kill job_1590670326852_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2020-05-29 14:47:24,644 Stage-1 map = 0%, reduce = 0%
2020-05-29 14:47:41,549 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1590670326852_0003 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1590670326852_0003_m_000000 (and more) from job job_1590670326852_0003
Task with the most failures(4):
-----
Task ID:
task_1590670326852_0003_m_000000
URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1590670326852_0003&tipid=task_1590670326852_0003_m_000000
-----
Diagnostic Messages for this Task:
[2020-05-29 14:47:40.355]Exception from container-launch.
Container id: container_1590670326852_0003_01_000005
Exit code: 1
[2020-05-29 14:47:40.360]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 1600
[2020-05-29 14:47:40.361]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 1600
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive>
Below are few lines from Hadoop logs.
2020-05-29 14:47:28,262 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
2020-05-29 14:47:28,262 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1590670326852_0003_m_000000_0: [2020-05-29 14:47:27.559]Exception from container-launch.
Container id: container_1590670326852_0003_01_000002
Exit code: 1
[2020-05-29 14:47:27.565]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 1600
[2020-05-29 14:47:27.566]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 1600
I have tried all the configuration changes suggested in different threads, but it is not working.
I have also tried the Hadoop MapReduce WordCount example, and it fails with the same error.
All Hadoop processes seem to be running fine. Output of the jps command:
9473 NodeManager
11798 Jps
9096 ResourceManager
8554 DataNode
8331 NameNode
8827 SecondaryNameNode
Please suggest how to resolve this error.
It looks like the launch command of the MapReduce task contains an invalid option, 1600. Check whether there is a misconfigured yarn.app.mapreduce.am.command-opts property with a value of 1600 in your yarn-site.xml (this MapReduce property is normally set in mapred-site.xml, so check there too).
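For illustration, a sketch of the kind of entry to look for; the corrected value -Xmx1600m is only a guess at what the bare 1600 was meant to be:

<!-- Suspect entry: a bare number here ends up on the container launch command line,
     where the JVM treats it as a main class name. -->
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>1600</value>
</property>

<!-- A valid value must be a real JVM option, for example: -->
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx1600m</value>
</property>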

Executing a Hive query causes the YARN resource manager to throw a "file does not exist" exception

I'm configuring Hive 3.1.0 to work with Hadoop 3.0.0.
This error is thrown almost immediately when I submit, through Beeline, a simple query that triggers MapReduce:
0: jdbc:hive2://> select count(*) from airlinedata;
18/10/11 10:24:45 [HiveServer2-Background-Pool: Thread-124]: WARN ql.Driver: Hive-on-MR is deprecated in Hive 2 and may not be available in the futureversions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = UUT81HC_20181011102444_2df01ff5-ca05-403c-b0e1-15f8f7715dc7
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
2018-10-11 10:24:45,510 INFO [HiveServer2-Background-Pool: Thread-124] client.RMProxy (RMProxy.java:newProxyInstance(133)) - Connecting to ResourceManager at /10.184.153.232:8032
2018-10-11 10:24:45,555 INFO [HiveServer2-Background-Pool: Thread-124] client.RMProxy (RMProxy.java:newProxyInstance(133)) - Connecting to ResourceManager at /10.184.153.232:8032
18/10/11 10:24:45 [HiveServer2-Background-Pool: Thread-124]: WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
WARN : Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:73)
at org.apache.hadoop.mapreduce.TypeConverter.toYarn(TypeConverter.java:78)
at org.apache.hadoop.mapred.ClientServiceDelegate.<init>(ClientServiceDelegate.java:120)
at org.apache.hadoop.mapred.ClientCache.getClient(ClientCache.java:68)
at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:343)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:329)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:70)
... 40 more
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
Location:
org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder.setAppId(Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto;)Lorg/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder; #36: invokevirtual
Reason:
Type 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage'
Current Frame:
bci: #36
flags: { }
locals: { 'org/apache/hadoop/mapreduce/v2/proto/MRProtos$JobIdProto$Builder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationIdProto' }
Bytecode:
0x0000000: 2ab4 0011 c700 1b2b c700 0bbb 002f 59b7
0x0000010: 0030 bf2a 2bb5 000a 2ab6 0031 a700 0c2a
0x0000020: b400 112b b600 3257 2a59 b400 1304 80b5
0x0000030: 0013 2ab0
Stackmap Table:
same_frame(#19)
same_frame(#31)
same_frame(#40)
at org.apache.hadoop.mapreduce.v2.proto.MRProtos$JobIdProto.newBuilder(MRProtos.java:1017)
at org.apache.hadoop.mapreduce.v2.api.records.impl.pb.JobIdPBImpl.<init>(JobIdPBImpl.java:37)
... 45 more
YARN ResourceManager stack trace:
2018-10-11 10:24:49,896 INFO rmapp.RMAppImpl: application_1539226955170_0002 State change from ACCEPTED to FINAL_SAVING on event = ATTEMPT_FAILED
2018-10-11 10:24:49,896 INFO recovery.RMStateStore: Updating info for app: application_1539226955170_0002
2018-10-11 10:24:49,897 INFO capacity.CapacityScheduler: Application Attempt appattempt_1539226955170_0002_000002 is done. finalState=FAILED
2018-10-11 10:24:49,897 INFO rmapp.RMAppImpl: Application application_1539226955170_0002 failed 2 times due to AM Container for appattempt_1539226955170_0002_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2018-10-11 10:24:49.876]File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
java.io.FileNotFoundException: File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1495)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1488)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1503)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:366)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:241)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:234)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:222)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002 Then click on links to logs of each attempt.
. Failing the application.
2018-10-11 10:24:49,897 INFO scheduler.AppSchedulingInfo: Application application_1539226955170_0002 requests cleared
2018-10-11 10:24:49,897 INFO rmapp.RMAppImpl: application_1539226955170_0002 State change from FINAL_SAVING to FAILED on event = APP_UPDATE_SAVED
2018-10-11 10:24:49,898 INFO capacity.LeafQueue: Application removed - appId: application_1539226955170_0002 user: UUT81HC queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2018-10-11 10:24:49,898 WARN resourcemanager.RMAuditLogger: USER=UUT81HC OPERATION=Application Finished - Failed
TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1539226955170_0002 failed 2 times due to AM Container for appattempt_1539226955170_0002_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2018-10-11 10:24:49.876]File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
java.io.FileNotFoundException: File does not exist: hdfs://10.184.153.232:19000/tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1495)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1488)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1503)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:366)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:241)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:234)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:222)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002 Then click on links to logs of each attempt.
. Failing the application. APPID=application_1539226955170_0002
2018-10-11 10:24:49,898 INFO capacity.ParentQueue: Application removed - appId: application_1539226955170_0002 user: UUT81HC leaf-queue of parent: root #applications: 0
2018-10-11 10:24:49,899 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1539226955170_0002,name=select count(*) from airlinedata (Stage-1),user=UUT81HC,queue=default,state=FAILED,trackingUrl=http://HC-UT40048C.apac.com:8088/cluster/app/application_1539226955170_0002,appMasterHost=N/A,submitTime=1539228287412,startTime=1539228287413,finishTime=1539228289896,finalStatus=FAILED,memorySeconds=1482,vcoreSeconds=0,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=,applicationType=MAPREDUCE,resourceSeconds=1482 MB-seconds\, 0 vcore-seconds,preemptedResourceSeconds=0 MB-seconds\, 0 vcore-seconds
After examining how Hive executes a MapReduce job on YARN, I found that it first creates map.xml and reduce.xml in /tmp with permission drwx------ (only the owner can use them):
2018-10-11 10:24:45,133 INFO hdfs.StateChange: BLOCK* allocate blk_1073742318_1495, replicas=10.184.153.232:9866 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml
2018-10-11 10:24:45,225 INFO hdfs.StateChange: DIR* completeFile: /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:45,248 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/map.xml
2018-10-11 10:24:45,294 INFO hdfs.StateChange: BLOCK* allocate blk_1073742319_1496, replicas=10.184.153.232:9866 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
2018-10-11 10:24:45,411 INFO hdfs.StateChange: DIR* completeFile: /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:45,437 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hive/UUT81HC/0d321851-1d90-4f19-ac50-12d120da601d/hive_2018-10-11_10-24-44_868_5772391105026287697-3/-mr-10005/b8800c0f-f09c-41ca-ab69-a79b72fc9597/reduce.xml
2018-10-11 10:24:45,772 INFO hdfs.StateChange: BLOCK* allocate blk_1073742320_1497, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar
2018-10-11 10:24:46,438 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,463 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.jar
2018-10-11 10:24:46,618 INFO namenode.FSDirectory: Increasing replication from 2 to 10 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split
2018-10-11 10:24:46,639 INFO hdfs.StateChange: BLOCK* allocate blk_1073742321_1498, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split
2018-10-11 10:24:46,706 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.split is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,791 INFO hdfs.StateChange: BLOCK* allocate blk_1073742322_1499, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.splitmetainfo
2018-10-11 10:24:46,870 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.splitmetainfo is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:24:46,971 INFO hdfs.StateChange: BLOCK* allocate blk_1073742323_1500, replicas=10.184.153.232:9866 for /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.xml
2018-10-11 10:24:47,370 INFO hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/UUT81HC/.staging/job_1539226955170_0002/job.xml is closed by DFSClient_NONMAPREDUCE_164506931_1
2018-10-11 10:32:15,741 INFO blockmanagement.BlockManager: StorageInfo TreeSet fill ratio DS-d4c2a5a0-435d-4b44-b408-3cd04587cd09 : 1.0
But somehow YARN can't read those files when executing the job and throws "file does not exist". I did set permission 777 on /tmp, but these files are created by Hive itself during execution, so I can't do anything about them.
I suspect this problem is related to the user or permissions when using Hive on Hadoop. What should I do about this?
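One way to narrow this down (a suggestion only, not a confirmed fix) is to check, as the user submitting the job, whether the scratch and staging files are actually visible in HDFS while the query runs; the paths are the ones from the logs above:

# List the Hive scratch directory for this user recursively
hdfs dfs -ls -R /tmp/hive/UUT81HC
# And the MapReduce staging directory for the failing job
hdfs dfs -ls -R /tmp/hadoop-yarn/staging/UUT81HC/.staging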

SparkR RStudio error

With SparkR in RStudio, I am unable to read data in; it fails with an error.
How can I solve this?
Environment:
R: version 3.3.1
RStudio: version 0.99.902
SparkR: version 1.6.1
Mac: version 10.11.6
Code:
SPARK_HOME <- "/usr/local/Cellar/apache-spark/1.6.1/libexec"
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
.libPaths(c(file.path(SPARK_HOME, "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master="local[3]", sparkHome=SPARK_HOME,
sparkEnvir=list(spark.driver.maemory="6g",
sparkPackages="com.databricks:spark-csv_2.10:1.4.0"))
sqlContext <- sparkRSQL.init(sc)
WARN
WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
code
df <- read.df(sqlContext, "iris.csv", source="com.databricks.spark.csv", inferSchema="true")
WARN
WARN : Your hostname, xxxx-no-MacBook-Pro.local resolves to a loopback/non-reachable address: fe80:0:0:0:701f:d8ff:fe34:fd1%8, but we couldn't find any external IP address!
ERROR
ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
WARN
16/07/20 14:00:44 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
ERROR
16/07/20 14:00:44 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
16/07/20 14:00:44 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/07/20 14:00:44 INFO TaskSchedulerImpl: Cancelling stage 0
16/07/20 14:00:44 INFO DAGScheduler: ResultStage 0 (first at CsvRelation.scala:267) failed in 60.099 s
16/07/20 14:00:44 INFO DAGScheduler: Job 0 failed: first at CsvRelation.scala:267, took 60.168711 s
16/07/20 14:00:44 ERROR RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
I don't understand how to solve this.
Please advise.
Try this
Sys.setenv(SPARK_HOME="/usr/local/Cellar/apache-spark/1.6.1/libexec")
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R","lib")))
sc <- sparkR.init(master="local", sparkEnvir = list(spark.driver.memory="4g", spark.executor.memory="6g"))
sqlContext <- sparkRSQL.init(sc)
It works for me.

Spark HbaseRDD giving Exception

I am trying to read from HBase using the following code:
JavaPairRDD<ImmutableBytesWritable, Result> pairRdd = ctx
.newAPIHadoopRDD(conf, TableInputFormat.class,
ImmutableBytesWritable.class,
org.apache.hadoop.hbase.client.Result.class).cache().cache();
System.out.println(pairRdd.count());
But I am getting the exception:
java.lang.IllegalStateException: unread block data
The full code is below:
SparkConf sparkConf = new SparkConf().setAppName("JavaSparkSQL");
sparkConf.set("spark.master","spark://192.168.50.247:7077");
/* String [] stjars={"/home/BreakDown/SparkDemo2/target/SparkDemo2-0.0.1-SNAPSHOT.jar"};
sparkConf.setJars(stjars);*/
JavaSparkContext ctx = new JavaSparkContext(sparkConf);
JavaSQLContext sqlCtx = new JavaSQLContext(ctx);
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.master","192.168.50.73:60000");
conf.set("hbase.zookeeper.quorum","192.168.50.73");
conf.set("hbase.zookeeper.property.clientPort","2181");
conf.set("zookeeper.session.timeout","6000");
conf.set("zookeeper.recovery.retry","1");
conf.set("hbase.mapreduce.inputtable","employee11");
Any pointer will be of great help.
Versions: Spark 1.1.1 (built for Hadoop 2), Hadoop 2.2.0, HBase 0.98.8-hadoop2.
Please find the stack trace below:
14/12/17 21:18:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/12/17 21:18:46 INFO AppClient$ClientActor: Connecting to master spark://192.168.50.247:7077...
14/12/17 21:18:46 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
14/12/17 21:18:46 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20141217211846-0035
14/12/17 21:18:47 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 192.168.50.253, ANY, 1256 bytes)
14/12/17 21:18:47 INFO BlockManagerMasterActor: Registering block manager 192.168.50.253:41717 with 265.4 MB RAM, BlockManagerId(0, 192.168.50.253, 41717, 0)
14/12/17 21:18:48 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.50.253): java.lang.IllegalStateException: unread block data
java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2420)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1380)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:160)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:724)
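For reference, a self-contained sketch of the two snippets above with the imports they need (Spark 1.1 / HBase 0.98 era APIs); the comment about the executor classpath describes a common cause of "unread block data" in this kind of setup, not a confirmed diagnosis for this cluster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class HBaseRead {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("JavaSparkSQL")
                .setMaster("spark://192.168.50.247:7077");
        // The application jar plus the hbase-client/hbase-server jars must be visible
        // to the executors (e.g. via sparkConf.setJars(...)); a version or classpath
        // mismatch there often surfaces as IllegalStateException: unread block data.
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);

        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "192.168.50.73");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        conf.set(TableInputFormat.INPUT_TABLE, "employee11");

        JavaPairRDD<ImmutableBytesWritable, Result> pairRdd =
                ctx.newAPIHadoopRDD(conf, TableInputFormat.class,
                        ImmutableBytesWritable.class, Result.class);
        System.out.println(pairRdd.count());
        ctx.stop();
    }
}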

Datastax Enterprise 3.2 hive timeout exception

I tried to run a simple Hive query through DataStax Enterprise, but it always fails with a timeout (on a small data set or even empty tables). I've got 4 m1.large nodes on AWS (2x Cassandra & 2x Analytics). See below:
cqlsh:intracker> select count(*) from event_tracks_by_browser_date LIMIT 100000;
count
-------
15030
Then with hive:
hive> select * from event_tracks_by_browser_date where type_id=10;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201312292327_0003, Tracking URL = http://node3:50030/jobdetails.jsp?jobid=job_201312292327_0003
Kill Command = /usr/bin/dse hadoop job -Dmapred.job.tracker=10.234.9.204:8012 -kill job_201312292327_0003
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
2013-12-30 10:30:27,578 Stage-1 map = 0%, reduce = 0%
2013-12-30 10:31:27,890 Stage-1 map = 0%, reduce = 0%
2013-12-30 10:32:28,137 Stage-1 map = 0%, reduce = 0%
2013-12-30 10:33:12,344 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201312292327_0003 with errors
Error during job, obtaining debugging information...
Examining task ID: task_201312292327_0003_m_000003 (and more) from job job_201312292327_0003
Exception in thread "Thread-10" java.lang.RuntimeException: Error while reading from task log url
at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:240)
at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:227)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://node2:50060/tasklog?taskid=attempt_201312292327_0003_m_000000_1&start=-8193
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626)
at java.net.URL.openStream(URL.java:1037)
at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getStackTraces(TaskLogProcessor.java:192)
... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 2 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
I checked system.log, and it looks like some kind of timeout occurs.
INFO [IPC Server handler 6 on 8012] 2013-12-30 10:32:24,880 TaskInProgress.java (line 551) Error from attempt_201312292327_0003_m_000001_2: java.io.IOException: java.io.IOException: java.lang.RuntimeException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: java.io.IOException: java.lang.RuntimeException
at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:100)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:240)
... 9 more
Caused by: java.lang.RuntimeException
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:648)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.<init>(CqlPagingRecordReader.java:301)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:167)
at org.apache.hadoop.hive.cassandra.cql3.input.CqlHiveRecordReader.initialize(CqlHiveRecordReader.java:91)
at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:94)
... 10 more
Caused by: TimedOutException()
at org.apache.cassandra.thrift.Cassandra$execute_prepared_cql3_query_result.read(Cassandra.java:42710)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_prepared_cql3_query(Cassandra.java:1724)
at org.apache.cassandra.thrift.Cassandra$Client.execute_prepared_cql3_query(Cassandra.java:1709)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:637)
... 14 more
Cassandra cqlsh works with no problems... any ideas?
Try increasing your replication factor to 2 for Analytics.
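Assuming the keyspace uses NetworkTopologyStrategy with per-datacenter replica counts (the usual DSE layout), the change would look roughly like the following; the keyspace name comes from the cqlsh prompt above, while the datacenter names are assumptions that should be verified with nodetool status:

ALTER KEYSPACE intracker
  WITH replication = {'class': 'NetworkTopologyStrategy',
                      'Cassandra': 2, 'Analytics': 2};
-- followed by a repair so the Analytics nodes actually receive the data:
-- nodetool repair intracker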

Resources