Is there a way to get a list of all available YARN queues from the command line, without resorting to parsing the capacity-scheduler.xml file?
I'm using Hadoop version 2.7.2
You can use Hadoop's built-in mapred command-line tool:
me@here.com$ mapred queue -list
======================
Queue Name : root.tenant1
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.tenant1.default
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.tenant1.users
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.tenant2
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.tenant2.default
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.tenant2.users
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
It provides simple, readable output that shows the queue hierarchy.
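If you only need the queue names from that listing, a small parser is enough. This is a minimal sketch that assumes the output keeps the "Queue Name : ..." layout shown above; the function name and the sample text are illustrative only.

```python
import re


def parse_queue_names(listing: str) -> list:
    """Return the queue names found in `mapred queue -list` style output."""
    names = []
    for line in listing.splitlines():
        # Matches lines such as "Queue Name : root.tenant1"
        match = re.match(r"\s*Queue Name\s*:\s*(\S+)", line)
        if match:
            names.append(match.group(1))
    return names


# Shortened copy of the output above, used only for demonstration.
SAMPLE = """\
======================
Queue Name : root.tenant1
Queue State : running
======================
Queue Name : root.tenant1.default
Queue State : running
"""

if __name__ == "__main__":
    print(parse_queue_names(SAMPLE))
```

In practice you could feed it the captured output of the command, for example via subprocess.run(["mapred", "queue", "-list"], capture_output=True, text=True).stdout.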
One way is to use the ResourceManager REST API, for example:
curl '<resourcemanager_host>:<http_port>/ws/v1/cluster/scheduler' | jq '.scheduler.schedulerInfo.queues.queue[] | .queueName'
This lists all top-level queues.
curl '<resourcemanager_host>:<http_port>/ws/v1/cluster/scheduler' | jq .
This returns all kinds of information about the scheduler and its queues, so you can extract whatever you need with jq.
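The jq filter above only reaches the top level. To list nested queues as well, the same JSON can be walked recursively. The sketch below assumes the Capacity Scheduler layout, where each queue may carry its own queues.queue children; the host in the usage note is a placeholder, and the handling of a single child returned as an object instead of a list is a defensive assumption.

```python
import json
from urllib.request import urlopen


def collect_queue_names(node: dict) -> list:
    """Depth-first walk over a scheduler node, returning every queueName."""
    children = (node.get("queues") or {}).get("queue") or []
    if isinstance(children, dict):
        # Defensive assumption: a lone child might arrive as an object.
        children = [children]
    names = []
    for queue in children:
        names.append(queue["queueName"])
        names.extend(collect_queue_names(queue))
    return names


def fetch_queue_names(base_url: str) -> list:
    """Fetch the scheduler info from the ResourceManager and list all queues."""
    with urlopen(base_url + "/ws/v1/cluster/scheduler") as response:
        payload = json.load(response)
    return collect_queue_names(payload["scheduler"]["schedulerInfo"])


# Hand-written sample shaped like schedulerInfo, for demonstration only.
SAMPLE = {
    "queues": {
        "queue": [
            {
                "queueName": "tenant1",
                "queues": {"queue": [{"queueName": "default"}, {"queueName": "users"}]},
            },
            {"queueName": "tenant2"},
        ]
    }
}

if __name__ == "__main__":
    print(collect_queue_names(SAMPLE))
```

Against a live cluster you would call fetch_queue_names("http://<resourcemanager_host>:<http_port>") with your own host and port.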
The software versions are as follows:
apache hbase 2.1.6
apache flink 1.13.6
apache hadoop 3.1.1
When I use the hbase-client API to access HBase, I get the following error:
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=16, exceptions:
Wed Sep 28 03:03:11 UTC 2022, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68532: java.io.IOException: Invalid currTagsLen -32239. Block offset: 1319713, block length: 99991, position: 42422 (without header). path=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/cd083a4a1ef04baff94ebb5aabdb8cb8/i/1f6dd8a1bc054eefbc9faa1bf625e24f
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:472)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: java.lang.IllegalStateException: Invalid currTagsLen -32239. Block offset: 1319713, block length: 99991, position: 42422 (without header). path=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/cd083a4a1ef04baff94ebb5aabdb8cb8/i/1f6dd8a1bc054eefbc9faa1bf625e24f
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.checkTagsLen(HFileReaderImpl.java:642)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.readKeyValueLen(HFileReaderImpl.java:630)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl._next(HFileReaderImpl.java:1080)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.next(HFileReaderImpl.java:1097)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:208)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:120)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:653)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:6581)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6745)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6518)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3155)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3404)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42190)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
... 3 more
The exception in the HBase RegionServer log is as follows:
2022-09-28 11:19:36,019 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: HBase metrics system started
2022-09-28 11:20:20,946 INFO [MemStoreFlusher.0] regionserver.HRegion: Flushing 1/1 column families, dataSize=1.95 MB heapSize=2.09 MB
2022-09-28 11:20:20,969 INFO [MemStoreFlusher.0] regionserver.DefaultStoreFlusher: Flushed memstore data size=1.95 MB at sequenceid=8934625 (bloomFilter=true), to=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/.tmp/i/2629dbae7d5e402489ef56b1c097289f
2022-09-28 11:20:20,977 INFO [MemStoreFlusher.0] regionserver.HStore: Added hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/2629dbae7d5e402489ef56b1c097289f, entries=1212, sequenceid=8934625, filesize=359.1 K
2022-09-28 11:20:20,978 INFO [MemStoreFlusher.0] regionserver.HRegion: Finished flush of dataSize ~1.95 MB/2041026, heapSize ~2.09 MB/2190200, currentSize=0 B/0 for e63ee2269b0b076a415c5f76d546855f in 32ms, sequenceid=8934625, compaction requested=true
2022-09-28 11:20:20,986 INFO [regionserver/bghbaseclusterdn9528:16020-shortCompactions-1664173471436] regionserver.HRegion: Starting compaction of i in expose,9ffffff6,1663741391432.e63ee2269b0b076a415c5f76d546855f.
2022-09-28 11:20:20,986 INFO [regionserver/bghbaseclusterdn9528:16020-shortCompactions-1664173471436] regionserver.HStore: Starting compaction of [hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/98d0ecd1ed7744a8a5f94923c382861e, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/30bab1682dba4721b25e58b78dd17255, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/f80c2f08176e417a9184f434d4300935, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/52baca576c154c26b7df3b5d126d47b8, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/7d8291d422d042de9aa43aa5b79da6ad, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/8bf3b47909ab4eeb86d8a5c283cfe942, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/0663d48a4ed94dbe9fdc78f6649c1eb3, hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/b80b55d744174bc882db93283cd70c71] into tmpdir=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/.tmp, totalSize=18.9 M
2022-09-28 11:20:21,153 INFO [regionserver/bghbaseclusterdn9528:16020-shortCompactions-1664173471436] throttle.PressureAwareThroughputController: e63ee2269b0b076a415c5f76d546855f#i#compaction#637 average throughput is 122.45 MB/second, slept 0 time(s) and total slept time is 0 ms. 0 active operations remaining, total limit is 61.86 MB/second
2022-09-28 11:20:21,159 ERROR [regionserver/bghbaseclusterdn9528:16020-shortCompactions-1664173471436] regionserver.CompactSplit: Compaction failed region=expose,9ffffff6,1663741391432.e63ee2269b0b076a415c5f76d546855f., storeName=i, priority=73, startTime=1664335220978
java.lang.IllegalStateException: Invalid currTagsLen -9. Block offset: 1677972, block length: 161891, position: 48652 (without header). path=hdfs://cthbaseclusterpro01/apps/hbase/data/data/default/expose/e63ee2269b0b076a415c5f76d546855f/i/b80b55d744174bc882db93283cd70c71
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.checkTagsLen(HFileReaderImpl.java:642)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.readKeyValueLen(HFileReaderImpl.java:630)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl._next(HFileReaderImpl.java:1080)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.next(HFileReaderImpl.java:1097)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:208)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:120)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:653)
at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:388)
at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:327)
at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1410)
at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2187)
at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:596)
at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:638)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2022-09-28 11:20:25,000 INFO [RpcServer.default.FPBQ.Fifo.handler=18,queue=3,port=16020] regionserver.HRegion: writing data to region expose,9ffffff6,1663741391432.e63ee2269b0b076a415c5f76d546855f. with WAL disabled. Data may be lost in the event of a crash.
2022-09-28 11:24:01,565 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=1.08 GB, freeSize=2.52 GB, max=3.60 GB, blockCount=17155, accesses=133155383, hits=132992986, hitRatio=99.88%, , cachingAccesses=132985682, cachingHits=132951576, cachingHitsRatio=99.97%, evictions=16199, evicted=0, evictedPerRun=0.0
2022-09-28 11:24:01,569 INFO [MobFileCache #0] mob.MobFileCache: MobFileCache Statistics, access: 0, miss: 0, hit: 0, hit ratio: 0%, evicted files: 0
2022-09-28 11:24:05,246 INFO [regionserver/bghbaseclusterdn9528:16020.logRoller] wal.AbstractFSWAL: Rolled WAL /apps/hbase/data/WALs/bghbaseclusterdn9528,16020,1664173440239/bghbaseclusterdn9528%2C16020%2C1664173440239.1664331845190 with entries=21, filesize=5.39 KB; new WAL /apps/hbase/data/WALs/bghbaseclusterdn9528,16020,1664173440239/bghbaseclusterdn9528%2C16020%2C1664173440239.1664335445235
I found some potential solutions in the code, such as HBASE-21507, HBASE-24515, and HBASE-21775.
I need help resolving the issue below.
I have installed Ubuntu as a Windows subsystem on Windows 10.
I installed Hadoop 3.1.3 and Hive 3.1.2.
When I run a normal query that does not need MapReduce, it runs fine.
hive> use bhudwh;
OK
Time taken: 1.075 seconds
hive> select id from matches where id < 5;
OK
1
2
3
4
Time taken: 6.012 seconds, Fetched: 4 row(s)
hive>
When I run a query that needs MapReduce, it fails with: Error: Could not find or load main class 1600.
hive> select distinct id from matches;
Query ID = bhush_20200529144705_62bc4f10-1604-453f-a90c-ed905c9c1fe9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1590670326852_0003, Tracking URL = http://DESKTOP-EU9VK4S.localdomain:8088/proxy/application_1590670326852_0003/
Kill Command = /mnt/e/Study/Hadoop/hadoop-3.1.3/bin/mapred job -kill job_1590670326852_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2020-05-29 14:47:24,644 Stage-1 map = 0%, reduce = 0%
2020-05-29 14:47:41,549 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1590670326852_0003 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1590670326852_0003_m_000000 (and more) from job job_1590670326852_0003
Task with the most failures(4):
-----
Task ID:
task_1590670326852_0003_m_000000
URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1590670326852_0003&tipid=task_1590670326852_0003_m_000000
-----
Diagnostic Messages for this Task:
[2020-05-29 14:47:40.355]Exception from container-launch.
Container id: container_1590670326852_0003_01_000005
Exit code: 1
[2020-05-29 14:47:40.360]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 1600
[2020-05-29 14:47:40.361]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 1600
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive>
Below are a few lines from the Hadoop logs.
2020-05-29 14:47:28,262 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
2020-05-29 14:47:28,262 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1590670326852_0003_m_000000_0: [2020-05-29 14:47:27.559]Exception from container-launch.
Container id: container_1590670326852_0003_01_000002
Exit code: 1
[2020-05-29 14:47:27.565]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 1600
[2020-05-29 14:47:27.566]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class 1600
I have tried all the configuration changes suggested in different threads, but it's still not working.
I have also tried the Hadoop MapReduce WordCount example, and it fails with the same error.
All Hadoop processes seem to be running fine. Here is the output of the jps command:
9473 NodeManager
11798 Jps
9096 ResourceManager
8554 DataNode
8331 NameNode
8827 SecondaryNameNode
Please suggest how to resolve this error.
It looks like the start command of the MapReduce task contains an illegal option, 1600. Check whether yarn.app.mapreduce.am.command-opts (or a similar JVM options property) is set to 1600 in your yarn-site.xml or mapred-site.xml.
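To hunt for such a value, you can scan the site XML files for JVM option properties containing a token that does not start with a dash; the JVM treats the first such token as the main class name, which would match the "main class 1600" message. This is a heuristic sketch: the property-name suffixes it checks and the sample XML are assumptions for illustration, and legitimate option arguments (for example the value after -cp) would also be flagged.

```python
import xml.etree.ElementTree as ET


def suspicious_jvm_opts(xml_text: str) -> dict:
    """Map property name -> tokens that do not look like JVM options."""
    findings = {}
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        # Assumed naming pattern for JVM option properties.
        if not name.endswith(("command-opts", "java.opts")):
            continue
        bad = [token for token in value.split() if not token.startswith("-")]
        if bad:
            findings[name] = bad
    return findings


# Hypothetical configuration fragment, not taken from the question.
SAMPLE = """<configuration>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx1024m 1600</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx820m</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(suspicious_jvm_opts(SAMPLE))
```

Run it over the contents of each *-site.xml in your Hadoop configuration directory and inspect any property it reports.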
Below is my requirement:
Input:
0104919 ,08476,48528,2016,2016-08-29
00104919 ,08476,48528,2016,2016-09-05
00104919 ,08476,48528,2016,2016-09-12
00104919 ,08476,48528,2017,2016-08-29
Output after join should be:
2,00104919 ,08476,48528,2016,2016-09-05,2016-09-12
3,00104919 ,08476,48528,2016,2016-09-12,2016-08-29
Below is my code:
TABL = LOAD '/TABL/part-r-00000' using PigStorage('~') AS (a,b,c,d,e,f);
pre_Q1 = FOREACH TABL generate a,b,c,d,e;
DIST = DISTINCT pre_Q1;
ORDR = ORDER DIST BY *;
Q1 = rank ORDR;
Q2 = FOREACH Q1 GENERATE rank_ORDR + 1 AS rank_Q2, a, b, c, d, e;
Q_join = join Q2 by (rank_Q2, a, b, c, d), Q1 by (rank_ORDR, a, b, c, d);
C = limit Q_join 100;
dump C;
I am getting the error below.
Can someone point out what might be causing it?
Failed Jobs:
JobId Alias Feature Message Outputs
job_1474127474437_528208 C,Q2,Q_join HASH_JOIN Message: Job failed!
Input(s):
Successfully read 5235587 records (1516199217 bytes) from: "/TABL/part-r-00000"
Output(s):
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1474127474437_528166 -> job_1474127474437_528185,
job_1474127474437_528185 -> job_1474127474437_528190,
job_1474127474437_528190 -> job_1474127474437_528204,
job_1474127474437_528204 -> job_1474127474437_528206,
job_1474127474437_528206 -> job_1474127474437_528208,
job_1474127474437_528208 -> null,
null
2017-01-04 04:02:37,407 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-01-04 04:02:37,569 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-01-04 04:02:37,729 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-01-04 04:02:37,887 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-01-04 04:02:37,945 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Some jobs have failed! Stop running all dependent jobs
2017-01-04 04:02:37,945 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias C
Details at logfile: /var/log/gphd/pig/pig.log
Try changing the first line as follows:
TABL = LOAD '/TABL/part-r-00000' using PigStorage(',') AS (a,b,c,d,e,f);
Also watch out for the space at the end of column a; it may affect the join.
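Both points can be seen outside Pig by splitting one of the sample input rows. This is only an illustration of delimiter splitting and trailing whitespace in plain Python, not of PigStorage itself; the helper name is made up for the example.

```python
# One row from the sample input in the question.
SAMPLE_LINE = "00104919 ,08476,48528,2016,2016-09-05"


def split_fields(line: str, delimiter: str, trim: bool = False) -> list:
    """Split a row on the delimiter, optionally trimming each field."""
    fields = line.split(delimiter)
    return [field.strip() for field in fields] if trim else fields


if __name__ == "__main__":
    # With '~' nothing is split, so the whole row lands in the first field.
    print(len(split_fields(SAMPLE_LINE, "~")))
    # With ',' the row splits into separate columns.
    print(split_fields(SAMPLE_LINE, ","))
    # The first column keeps its trailing space unless it is trimmed.
    print(repr(split_fields(SAMPLE_LINE, ",")[0]))
    print(repr(split_fields(SAMPLE_LINE, ",", trim=True)[0]))
```

In the Pig script, the equivalent of the trimming step would be applying TRIM to column a before the join.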
I had to restart the Elasticsearch master node. The status was red, and after some time it went yellow (the primary shards were assigned).
Now when I run the query curl http://x.x.x.x/_cluster/health?pretty I can see that "number_of_pending_tasks" keeps increasing (it is now at 200k).
I had a look at the pending tasks, and it is mainly tasks like this one that are queued:
, {
"insert_order" : 58176,
"priority" : "NORMAL",
"source" : "indices_store",
"executing" : false,
"time_in_queue_millis" : 619596,
"time_in_queue" : "10.3m"
},
In the meantime I get an error about a rejected execution due to the queue capacity:
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 200) on org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler@34c87ed9
How can I solve this?
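For reference, this is roughly how the pending tasks can be grouped by their "source" field to see what dominates the queue. It is a minimal sketch that assumes the response of GET /_cluster/pending_tasks has a top-level "tasks" list with entries like the one above; the sample payload is invented for the example.

```python
import json
from collections import Counter
from urllib.request import urlopen


def summarize_pending_tasks(payload: dict) -> list:
    """Return (source, count) pairs, most frequent source first."""
    counts = Counter(task.get("source", "unknown") for task in payload.get("tasks", []))
    return counts.most_common()


def fetch_pending_tasks(base_url: str) -> dict:
    """Fetch the pending cluster tasks; base_url is e.g. 'http://x.x.x.x:9200'."""
    with urlopen(base_url + "/_cluster/pending_tasks") as response:
        return json.load(response)


# Invented payload shaped like the entry shown above.
SAMPLE = {
    "tasks": [
        {"insert_order": 58176, "priority": "NORMAL", "source": "indices_store"},
        {"insert_order": 58177, "priority": "NORMAL", "source": "indices_store"},
        {"insert_order": 58178, "priority": "HIGH", "source": "shard-started"},
    ]
}

if __name__ == "__main__":
    print(summarize_pending_tasks(SAMPLE))
```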
I have an external table in hive that is stored on my hadoop cluster and I want to move its contents into an external table that is stored on Amazon s3.
So I created an s3 backed table like so:
CREATE EXTERNAL TABLE IF NOT EXISTS export.export_table
like table_to_be_exported
ROW FORMAT SERDE ...
with SERDEPROPERTIES ('fieldDelimiter'='|')
STORED AS TEXTFILE
LOCATION 's3a://bucket/folder';
Then I run: INSERT INTO export.export_table SELECT * FROM table_to_be_exported
It outputs the following:
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
WARN : Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
INFO : Starting Job = job_1435176004514_0028, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1435176004514_0028/
INFO : Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1435176004514_0028
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
INFO : 2015-07-06 09:22:18,379 Stage-1 map = 0%, reduce = 0%
INFO : 2015-07-06 09:22:27,795 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.9 sec
INFO : MapReduce Total cumulative CPU time: 2 seconds 900 msec
INFO : Ended Job = job_1435176004514_0028
INFO : Stage-4 is selected by condition resolver.
INFO : Stage-3 is filtered out by condition resolver.
INFO : Stage-5 is filtered out by condition resolver.
INFO : Moving data to: s3a://bucket/folder/.hive-staging_hive_2015-07-06_09-22-10_351_9216807769834089982-3/-ext-10000 from s3a://bucket/folder/.hive-staging_hive_2015-07-06_09-22-10_351_9216807769834089982-3/-ext-10002
ERROR : Failed with exception Wrong FS: s3a://bucket/folder/.hive-staging_hive_2015-07-06_09-22-10_351_9216807769834089982-3/-ext-10002, expected: hdfs://quickstart.cloudera:8020
java.lang.IllegalArgumentException: Wrong FS: s3a://bucket/folder/.hive-staging_hive_2015-07-06_09-22-10_351_9216807769834089982-3/-ext-10002, expected: hdfs://quickstart.cloudera:8020
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:1916)
at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:262)
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1187)
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2449)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1181)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1042)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:145)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:209)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask (state=08S01,code=1)
I have the s3a key and secret set in my Hadoop core-site.xml, and I am able to read from and write to S3 using Hadoop directly (for example, hdfs dfs -ls s3a://).
Any guesses as to what I could do to get this to work?
Try using s3 instead of s3a. My guess is that s3a is not supported yet in EMR's Hive distribution.
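For context on the message itself: "Wrong FS" is raised when a path is handed to a file system client whose URI does not match it. In the stack trace above, the HDFS client (DistributedFileSystem) is asked about an s3a:// staging path. The sketch below is a simplified illustration of that kind of scheme and authority comparison, not Hadoop's actual implementation; the function name is made up.

```python
from urllib.parse import urlparse


def same_filesystem(path_uri: str, fs_uri: str) -> bool:
    """Compare scheme and authority of a path against a file system URI."""
    path, fs = urlparse(path_uri), urlparse(fs_uri)
    return (path.scheme.lower(), path.netloc.lower()) == (
        fs.scheme.lower(),
        fs.netloc.lower(),
    )


if __name__ == "__main__":
    default_fs = "hdfs://quickstart.cloudera:8020"
    # The staging path from the error lives on s3a, not on the default FS.
    print(same_filesystem("s3a://bucket/folder/.hive-staging", default_fs))
    # A path on the default file system passes the same check.
    print(same_filesystem("hdfs://quickstart.cloudera:8020/tmp/x", default_fs))
```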