Is hive.exec.parallel broken? - hadoop

Apparently, there is a reason why hive.exec.parallel is false by default.
When I set it to true (as recommended by an answer to my previous question), my process dies with this message:
MapReduce Jobs Launched:
Job 0: Map: 2 Reduce: 1 Cumulative CPU: 6.43 sec HDFS Read: 556 HDFS Write: 96 SUCCESS
Job 1: Map: 1 Reduce: 1 Cumulative CPU: 3.15 sec HDFS Read: 475 HDFS Write: 96 SUCCESS
Job 2: Map: 1 Reduce: 1 Cumulative CPU: 3.36 sec HDFS Read: 475 HDFS Write: 96 SUCCESS
Job 3: Map: 1 Reduce: 1 Cumulative CPU: 2.19 sec HDFS Read: 475 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 15 seconds 130 msec
OK
normalized_keyword pixel_id count sum_log events
Time taken: 72.419 seconds
...
14.98user 0.62system 1:16.79elapsed 20%CPU (0avgtext+0avgdata 851392maxresident)k
8inputs+2096outputs (0major+83271minor)pagefaults 0swaps
text: java.io.EOFException
at java.io.DataInputStream.readShort(Unknown Source)
at org.apache.hadoop.fs.shell.Display$Text.getInputStream(Display.java:113)
at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
No useful data is produced. Setting hive.exec.parallel.thread.number=2 has no effect (same failure).
Suggestions?
EDIT: hive --version does not work, but when it starts, it prints
Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive/lib/hive-common-0.10.0-cdh4.4.0.jar!/hive-log4j.properties
so, I guess, the version is 0.10.0.
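For reference, the settings in question are enabled per session like this (a minimal sketch using the standard Hive property names; the actual query is omitted):
set hive.exec.parallel=true;
set hive.exec.parallel.thread.number=2;
-- failing query goes here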

Related

Can't add node to the CockroachDB cluster

I'm trying to join a CockroachDB node to a cluster.
I created the first cluster, then tried to join a second node to the first node, but the second node created a new cluster instead, as shown below.
Does anyone know what is wrong with my steps below? Any suggestions are welcome.
I started the first node as follows:
cockroach start --insecure --advertise-host=163.172.156.111
* Check out how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
*
CockroachDB node starting at 2019-05-11 01:11:15.45522036 +0000 UTC (took 2.5s)
build: CCL v19.1.0 @ 2019/04/29 18:36:40 (go1.11.6)
webui: http://163.172.156.111:8080
sql: postgresql://root@163.172.156.111:26257?sslmode=disable
client flags: cockroach <client cmd> --host=163.172.156.111:26257 --insecure
logs: /home/ueda/cockroach-data/logs
temp dir: /home/ueda/cockroach-data/cockroach-temp449555924
external I/O path: /home/ueda/cockroach-data/extern
store[0]: path=/home/ueda/cockroach-data
status: initialized new cluster
clusterID: 3e797faa-59a1-4b0d-83b5-36143ddbdd69
nodeID: 1
Then I started the second node to join 163.172.156.111, but it couldn't join:
cockroach start --insecure --advertise-addr=128.199.127.164 --join=163.172.156.111:26257
CockroachDB node starting at 2019-05-11 01:21:14.533097432 +0000 UTC (took 0.8s)
build: CCL v19.1.0 @ 2019/04/29 18:36:40 (go1.11.6)
webui: http://128.199.127.164:8080
sql: postgresql://root@128.199.127.164:26257?sslmode=disable
client flags: cockroach <client cmd> --host=128.199.127.164:26257 --insecure
logs: /home/ueda/cockroach-data/logs
temp dir: /home/ueda/cockroach-data/cockroach-temp067740997
external I/O path: /home/ueda/cockroach-data/extern
store[0]: path=/home/ueda/cockroach-data
status: restarted pre-existing node
clusterID: a14e89a7-792d-44d3-89af-7037442eacbc
nodeID: 1
The cockroach.log of the joining node shows a gossip error:
cat cockroach-data/logs/cockroach.log
I190511 01:21:13.762309 1 util/log/clog.go:1199 [config] file created at: 2019/05/11 01:21:13
I190511 01:21:13.762309 1 util/log/clog.go:1199 [config] running on machine: amfortas
I190511 01:21:13.762309 1 util/log/clog.go:1199 [config] binary: CockroachDB CCL v19.1.0 (x86_64-unknown-linux-gnu, built 2019/04/29 18:36:40, go1.11.6)
I190511 01:21:13.762309 1 util/log/clog.go:1199 [config] arguments: [cockroach start --insecure --advertise-addr=128.199.127.164 --join=163.172.156.111:26257]
I190511 01:21:13.762309 1 util/log/clog.go:1199 line format: [IWEF]yymmdd hh:mm:ss.uuuuuu goid file:line msg utf8=✓
I190511 01:21:13.762307 1 cli/start.go:1033 logging to directory /home/ueda/cockroach-data/logs
W190511 01:21:13.763373 1 cli/start.go:1068 RUNNING IN INSECURE MODE!
- Your cluster is open for any client that can access <all your IP addresses>.
- Any user, even root, can log in without providing a password.
- Any user, connecting as root, can read or write any data in your cluster.
- There is no network encryption nor authentication, and thus no confidentiality.
Check out how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
I190511 01:21:13.763675 1 server/status/recorder.go:610 available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
W190511 01:21:13.763752 1 cli/start.go:944 Using the default setting for --cache (128 MiB).
A significantly larger value is usually needed for good performance.
If you have a dedicated server a reasonable setting is --cache=.25 (248 MiB).
I190511 01:21:13.764011 1 server/status/recorder.go:610 available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
W190511 01:21:13.764047 1 cli/start.go:957 Using the default setting for --max-sql-memory (128 MiB).
A significantly larger value is usually needed in production.
If you have a dedicated server a reasonable setting is --max-sql-memory=.25 (248 MiB).
I190511 01:21:13.764239 1 server/status/recorder.go:610 available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
I190511 01:21:13.764272 1 cli/start.go:1082 CockroachDB CCL v19.1.0 (x86_64-unknown-linux-gnu, built 2019/04/29 18:36:40, go1.11.6)
I190511 01:21:13.866977 1 server/status/recorder.go:610 available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
I190511 01:21:13.867002 1 server/config.go:386 system total memory: 992 MiB
I190511 01:21:13.867063 1 server/config.go:388 server configuration:
max offset 500000000
cache size 128 MiB
SQL memory pool size 128 MiB
scan interval 10m0s
scan min idle time 10ms
scan max idle time 1s
event log enabled true
I190511 01:21:13.867098 1 cli/start.go:929 process identity: uid 1000 euid 1000 gid 1000 egid 1000
I190511 01:21:13.867115 1 cli/start.go:554 starting cockroach node
I190511 01:21:13.868242 21 storage/engine/rocksdb.go:613 opening rocksdb instance at "/home/ueda/cockroach-data/cockroach-temp067740997"
I190511 01:21:13.894320 21 server/server.go:876 [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I190511 01:21:13.894813 21 storage/engine/rocksdb.go:613 opening rocksdb instance at "/home/ueda/cockroach-data"
W190511 01:21:13.896301 21 storage/engine/rocksdb.go:127 [rocksdb] [/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb/db/version_set.cc:2566] More existing levels in DB than needed. max_bytes_for_level_multiplier may not be guaranteed.
W190511 01:21:13.905666 21 storage/engine/rocksdb.go:127 [rocksdb] [/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb/db/version_set.cc:2566] More existing levels in DB than needed. max_bytes_for_level_multiplier may not be guaranteed.
I190511 01:21:13.911380 21 server/config.go:494 [n?] 1 storage engine initialized
I190511 01:21:13.911417 21 server/config.go:497 [n?] RocksDB cache size: 128 MiB
I190511 01:21:13.911427 21 server/config.go:497 [n?] store 0: RocksDB, max size 0 B, max open file limit 10000
W190511 01:21:13.912459 21 gossip/gossip.go:1496 [n?] no incoming or outgoing connections
I190511 01:21:13.913206 21 server/server.go:926 [n?] Sleeping till wall time 1557537673913178595 to catches up to 1557537674394265598 to ensure monotonicity. Delta: 481.087003ms
I190511 01:21:14.251655 65 vendor/github.com/cockroachdb/circuitbreaker/circuitbreaker.go:322 [n?] circuitbreaker: gossip [::]:26257->163.172.156.111:26257 tripped: initial connection heartbeat failed: rpc error: code = Unknown desc = client cluster ID "a14e89a7-792d-44d3-89af-7037442eacbc" doesn't match server cluster ID "3e797faa-59a1-4b0d-83b5-36143ddbdd69"
I190511 01:21:14.251695 65 vendor/github.com/cockroachdb/circuitbreaker/circuitbreaker.go:447 [n?] circuitbreaker: gossip [::]:26257->163.172.156.111:26257 event: BreakerTripped
W190511 01:21:14.251763 65 gossip/client.go:122 [n?] failed to start gossip client to 163.172.156.111:26257: initial connection heartbeat failed: rpc error: code = Unknown desc = client cluster ID "a14e89a7-792d-44d3-89af-7037442eacbc" doesn't match server cluster ID "3e797faa-59a1-4b0d-83b5-36143ddbdd69"
I190511 01:21:14.395848 21 gossip/gossip.go:392 [n1] NodeDescriptor set to node_id:1 address:<network_field:"tcp" address_field:"128.199.127.164:26257" > attrs:<> locality:<> ServerVersion:<major_val:19 minor_val:1 patch:0 unstable:0 > build_tag:"v19.1.0" started_at:1557537674395557548
W190511 01:21:14.458176 21 storage/replica_range_lease.go:506 can't determine lease status due to node liveness error: node not in the liveness table
I190511 01:21:14.458465 21 server/node.go:461 [n1] initialized store [n1,s1]: disk (capacity=24 GiB, available=18 GiB, used=2.2 MiB, logicalBytes=41 MiB), ranges=20, leases=0, queries=0.00, writes=0.00, bytesPerReplica={p10=0.00 p25=0.00 p50=0.00 p75=6467.00 p90=26940.00 pMax=43017435.00}, writesPerReplica={p10=0.00 p25=0.00 p50=0.00 p75=0.00 p90=0.00 pMax=0.00}
I190511 01:21:14.458775 21 storage/stores.go:244 [n1] read 0 node addresses from persistent storage
I190511 01:21:14.459095 21 server/node.go:699 [n1] connecting to gossip network to verify cluster ID...
W190511 01:21:14.469842 96 storage/store.go:1525 [n1,s1,r6/1:/Table/{SystemCon…-11}] could not gossip system config: [NotLeaseHolderError] r6: replica (n1,s1):1 not lease holder; lease holder unknown
I190511 01:21:14.474785 21 server/node.go:719 [n1] node connected via gossip and verified as part of cluster "a14e89a7-792d-44d3-89af-7037442eacbc"
I190511 01:21:14.475033 21 server/node.go:542 [n1] node=1: started with [<no-attributes>=/home/ueda/cockroach-data] engine(s) and attributes []
I190511 01:21:14.475393 21 server/status/recorder.go:610 [n1] available memory from cgroups (8.0 EiB) exceeds system memory 992 MiB, using system memory
I190511 01:21:14.475514 21 server/server.go:1582 [n1] starting http server at [::]:8080 (use: 128.199.127.164:8080)
I190511 01:21:14.475572 21 server/server.go:1584 [n1] starting grpc/postgres server at [::]:26257
I190511 01:21:14.475605 21 server/server.go:1585 [n1] advertising CockroachDB node at 128.199.127.164:26257
W190511 01:21:14.475655 21 jobs/registry.go:341 [n1] unable to get node liveness: node not in the liveness table
I190511 01:21:14.532949 21 server/server.go:1650 [n1] done ensuring all necessary migrations have run
I190511 01:21:14.533020 21 server/server.go:1653 [n1] serving sql connections
I190511 01:21:14.533209 21 cli/start.go:689 [config] clusterID: a14e89a7-792d-44d3-89af-7037442eacbc
I190511 01:21:14.533257 21 cli/start.go:697 node startup completed:
CockroachDB node starting at 2019-05-11 01:21:14.533097432 +0000 UTC (took 0.8s)
build: CCL v19.1.0 @ 2019/04/29 18:36:40 (go1.11.6)
webui: http://128.199.127.164:8080
sql: postgresql://root@128.199.127.164:26257?sslmode=disable
client flags: cockroach <client cmd> --host=128.199.127.164:26257 --insecure
logs: /home/ueda/cockroach-data/logs
temp dir: /home/ueda/cockroach-data/cockroach-temp067740997
external I/O path: /home/ueda/cockroach-data/extern
store[0]: path=/home/ueda/cockroach-data
status: restarted pre-existing node
clusterID: a14e89a7-792d-44d3-89af-7037442eacbc
nodeID: 1
I190511 01:21:14.541205 146 server/server_update.go:67 [n1] no need to upgrade, cluster already at the newest version
I190511 01:21:14.555557 149 sql/event_log.go:135 [n1] Event: "node_restart", target: 1, info: {Descriptor:{NodeID:1 Address:128.199.127.164:26257 Attrs: Locality: ServerVersion:19.1 BuildTag:v19.1.0 StartedAt:1557537674395557548 LocalityAddress:[] XXX_NoUnkeyedLiteral:{} XXX_sizecache:0} ClusterID:a14e89a7-792d-44d3-89af-7037442eacbc StartedAt:1557537674395557548 LastUp:1557537671113461486}
I190511 01:21:14.916458 59 gossip/gossip.go:1510 [n1] node has connected to cluster via gossip
I190511 01:21:14.916660 59 storage/stores.go:263 [n1] wrote 0 node addresses to persistent storage
I190511 01:21:24.480247 116 storage/store.go:4220 [n1,s1] sstables (read amplification = 2):
0 [ 51K 1 ]: 51K
6 [ 1M 1 ]: 1M
I190511 01:21:24.480380 116 storage/store.go:4221 [n1,s1]
** Compaction Stats [default] **
Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
L0 1/0 50.73 KB 0.5 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 8.0 0 1 0.006 0 0
L6 1/0 1.26 MB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0.000 0 0
Sum 2/0 1.31 MB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 8.0 0 1 0.006 0 0
Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 8.0 0 1 0.006 0 0
Uptime(secs): 10.6 total, 10.6 interval
Flush(GB): cumulative 0.000, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
estimated_pending_compaction_bytes: 0 B
I190511 01:21:24.481565 121 server/status/runtime.go:500 [n1] runtime stats: 170 MiB RSS, 114 goroutines, 0 B/0 B/0 B GO alloc/idle/total, 14 MiB/16 MiB CGO alloc/total, 0.0 CGO/sec, 0.0/0.0 %(u/s)time, 0.0 %gc (7x), 50 KiB/1.5 MiB (r/w)net
What could be causing the join to be blocked? Thank you for any suggestions!
It seems you had previously started the second node (the one running on 128.199.127.164) by itself, creating its own cluster.
This can be seen in the error message:
W190511 01:21:14.251763 65 gossip/client.go:122 [n?] failed to start gossip client to 163.172.156.111:26257: initial connection heartbeat failed: rpc error: code = Unknown desc = client cluster ID "a14e89a7-792d-44d3-89af-7037442eacbc" doesn't match server cluster ID "3e797faa-59a1-4b0d-83b5-36143ddbdd69"
To be able to join the cluster, the data directory of the joining node must be empty. You can either delete cockroach-data or specify an alternate directory with --store=/path/to/data-dir
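For example (a sketch based on the paths from the logs above; only wipe cockroach-data if the data it holds is disposable):
# Option 1: wipe the stray single-node cluster on 128.199.127.164, then rejoin
rm -rf /home/ueda/cockroach-data
cockroach start --insecure --advertise-addr=128.199.127.164 --join=163.172.156.111:26257
# Option 2: keep the old directory and point the joining node at a fresh store
cockroach start --insecure --advertise-addr=128.199.127.164 --join=163.172.156.111:26257 --store=/path/to/data-dir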

Performance issues of small files on Hive

I was reading an article about how small files degrade the performance of Hive queries:
https://community.hitachivantara.com/community/products-and-solutions/pentaho/blog/2017/11/07/working-with-small-files-in-hadoop-part-1
I understand the first part regarding overloading the NameNode.
However, what it says regarding MapReduce does not seem to happen, for either MapReduce or Tez.
When a MapReduce job launches, it schedules one map task per block of
data being processed
I don't see a mapper task created per file. Maybe the reason is that the article refers to MapReduce v1, and a lot has changed since then.
Hive Version: Hive 1.2.1000.2.6.4.0-91
My table:
create table temp.emp_orc_small_files (id int, name string, salary int)
stored as orcfile;
Data:
The following code creates 100 small files, each containing only a few KB of data.
for i in {1..100}; do hive -e "insert into temp.emp_orc_small_files values(${i}, 'test_${i}', `shuf -i 1000-5000 -n 1`);";done
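A quick way to confirm that roughly 100 separate small files were created (a sketch; numFiles appears under the table parameters when statistics were auto-gathered by the inserts):
hive -e "describe formatted temp.emp_orc_small_files;" | grep -i numFiles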
However, I see only one mapper and one reducer task being created for the following query.
[root@sandbox-hdp ~]# hive -e "select max(salary) from temp.emp_orc_small_files"
log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.
Logging initialized using configuration in file:/etc/hive/2.6.4.0-91/0/hive-log4j.properties
Query ID = root_20180911200039_9e1361cb-0a5d-45a3-9c98-4aead46905ac
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1536258296893_0257)
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 .......... SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0
--------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 7.36 s
--------------------------------------------------------------------------------
OK
4989
Time taken: 13.643 seconds, Fetched: 1 row(s)
Same result with map-reduce.
hive> set hive.execution.engine=mr;
hive> select max(salary) from temp.emp_orc_small_files;
Query ID = root_20180911200545_c4f63cc6-0ab8-4bed-80fe-b4cb545018f2
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1536258296893_0259, Tracking URL = http://sandbox-hdp.hortonworks.com:8088/proxy/application_1536258296893_0259/
Kill Command = /usr/hdp/2.6.4.0-91/hadoop/bin/hadoop job -kill job_1536258296893_0259
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-09-11 20:05:57,213 Stage-1 map = 0%, reduce = 0%
2018-09-11 20:06:04,727 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 4.37 sec
2018-09-11 20:06:12,189 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.36 sec
MapReduce Total cumulative CPU time: 7 seconds 360 msec
Ended Job = job_1536258296893_0259
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 7.36 sec HDFS Read: 66478 HDFS Write: 5 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 360 msec
OK
4989
This is because the following configuration is taking effect:
hive.hadoop.supports.splittable.combineinputformat
From the documentation:
Whether to combine small input files so that fewer mappers are
spawned.
So essentially Hive can infer that the input is a group of small files, each smaller than the block size, and combine them, reducing the required number of mappers.
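One way to observe the per-file behavior the article describes is to switch off split combining for a single run (a sketch under the MapReduce engine; hive.input.format is the standard property, and HiveInputFormat, unlike the default CombineHiveInputFormat, creates roughly one split, and therefore one mapper, per file):
set hive.execution.engine=mr;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
select max(salary) from temp.emp_orc_small_files;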

Hive action fails on Oozie, while it works well on the Hive command line

Here is my workflow; it works well if I have simple SQL like show tables or drop partition (tried with Tez as well; it fails with a cryptic error message again).
<workflow-app xmlns="uri:oozie:workflow:0.4" name="UDEX-OOZIE POC">
<credentials>
<credential name="HiveCred" type="hcat">
<property>
<name>hcat.metastore.uri</name>
<value>thrift://xxxx.local:9083</value>
</property>
<property>
<name>hcat.metastore.principal</name>
<value>hive/_HOST@xxxx.LOCAL</value>
</property>
</credential>
</credentials>
<start to="IdEduTranCell-pm"/>
<action name="IdEduTranCell-pm" cred="HiveCred">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>${HiveConfigFile}</job-xml>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>default</value>
</property>
</configuration>
<script>${IdEduTranCell_path}</script>
<param>SOURCE_DB_NAME=${SOURCE_DB_NAME}</param>
<param>STRG_DB_NAME=${STRG_DB_NAME}</param>
<param>TABLE_NAME=${TABLE_NAME}</param>
<file>${IdEduTranCell_path}#${IdEduTranCell_path}</file>
<file>${HiveConfigFile}#${HiveConfigFile}</file>
</hive>
<ok to="sub-workflow-end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Sub-workflow failed while loading data into hive tables, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="sub-workflow-end"/>
</workflow-app>
But it fails for a SQL script. The data is not too large (it fails for even 1 record), so something is wrong that I can't spot in the log. Please help.
INSERT OVERWRITE TABLE xxx1.fact_tranCell
PARTITION (Timestamp)
select
`(Timestamp)?+.+`, Timestamp as Timestamp
from xxx2.fact_tranCell
order by tranCell_ID,ADMIN_CELL_ID, SITE_ID;
The SQL is not bad; it runs fine on the command line:
Status: Running (Executing on YARN cluster with App id application_1459756606292_15271)
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 .......... SUCCEEDED 7 7 0 0 0 0
Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0
Reducer 3 ...... SUCCEEDED 10 10 0 0 0 0
--------------------------------------------------------------------------------
VERTICES: 03/03 [==========================>>] 100% ELAPSED TIME: 19.72 s
--------------------------------------------------------------------------------
Loading data to table xxx1.fact_trancell partition (timestamp=null)
Time taken for load dynamic partitions : 496
Loading partition {timestamp=1464012900}
Time taken for adding to write entity : 8
Partition 4g_oss.fact_trancell{timestamp=1464012900} stats: [numFiles=10, numRows=4352, totalSize=9660382, rawDataSize=207776027]
OK
Time taken: 34.595 seconds
--------------------------- LOG ----------------------------
Starting Job = job_1459756606292_15285, Tracking URL = hxxp://xxxx.local:8088/proxy/application_1459756606292_15285/
Kill Command = /usr/bin/hadoop job -kill job_1459756606292_15285
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-05-27 17:32:35,792 Stage-1 map = 0%, reduce = 0%
2016-05-27 17:32:51,692 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 9.97 sec
2016-05-27 17:33:02,263 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.97 sec
MapReduce Total cumulative CPU time: 14 seconds 970 msec
Ended Job = job_1459756606292_15285
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1459756606292_15286, Tracking URL = hxxp://xxxx.local:8088/proxy/application_1459756606292_15286/
Kill Command = /usr/bin/hadoop job -kill job_1459756606292_15286
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2016-05-27 17:33:16,583 Stage-2 map = 0%, reduce = 0%
2016-05-27 17:33:29,814 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 4.29 sec
2016-05-27 17:33:45,587 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 38.74 sec
2016-05-27 17:33:53,990 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 4.29 sec
2016-05-27 17:34:08,662 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 39.27 sec
2016-05-27 17:34:17,061 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 4.29 sec
2016-05-27 17:34:28,576 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 38.28 sec
2016-05-27 17:34:36,940 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 4.29 sec
2016-05-27 17:34:48,435 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 38.09 sec
MapReduce Total cumulative CPU time: 38 seconds 90 msec
Ended Job = job_1459756606292_15286 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://xxxx.local:8088/proxy/application_1459756606292_15286/
Examining task ID: task_1459756606292_15286_m_000000 (and more) from job job_1459756606292_15286
Task with the most failures(4):
-----
Task ID:
task_1459756606292_15286_r_000000
URL:
hxxp://xxxx.local:8088/taskdetails.jsp?jobid=job_1459756606292_15286&tipid=task_1459756606292_15286_r_000000
-----
Diagnostic Messages for this Task:
Error: Java heap space
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 14.97 sec HDFS Read: 87739161 HDFS Write: 16056577 SUCCESS
Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 38.09 sec HDFS Read: 16056995 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 53 seconds 60 msec
Intercepting System.exit(2)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [2]
I can see in the log that the error occurs while writing the ORC files; strangely, the ORC files are written fine from the command line (see the note after the log).
2016-05-30 11:11:20,377 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[1]: records written - 1
2016-05-30 11:11:21,307 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[1]: records written - 10
2016-05-30 11:11:21,917 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[1]: records written - 100
2016-05-30 11:11:22,420 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[1]: records written - 1000
2016-05-30 11:11:24,181 INFO [main] org.apache.hadoop.hive.ql.exec.ExtractOperator: 0 finished. closing...
2016-05-30 11:11:24,181 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: 1 finished. closing...
2016-05-30 11:11:24,181 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[1]: records written - 4352
2016-05-30 11:11:33,028 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
at org.apache.hadoop.hive.ql.io.orc.OutStream.write(OutStream.java:128)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerWriterV2.writeDirectValues(RunLengthIntegerWriterV2.java:374)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerWriterV2.writeValues(RunLengthIntegerWriterV2.java:182)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerWriterV2.write(RunLengthIntegerWriterV2.java:762)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.flushDictionary(WriterImpl.java:1211)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1132)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1616)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1996)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2288)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:186)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:952)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
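One thing worth trying (a sketch, not a confirmed fix): the Oozie launcher may not pick up the same reducer heap settings as the interactive CLI session, so pinning them explicitly at the top of the Hive script at least makes the two environments comparable. mapreduce.reduce.memory.mb and mapreduce.reduce.java.opts are standard MapReduce properties; the values below are only illustrative.
set mapreduce.reduce.memory.mb=4096;
set mapreduce.reduce.java.opts=-Xmx3276m;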

Hive not running Map Reduce with "where" clause

I'm trying out something simple in Hive on HDFS.
The problem is that the queries are not running map reduce when I'm running a "where clause". However, it runs map reduce for count(*), and even group by clauses.
Here's data and queries with result:
Create External Table:
CREATE EXTERNAL TABLE testtab1 (
id STRING, org STRING)
row format delimited
fields terminated by ','
stored as textfile
location '/usr/ankuchak/testtable1';
Simple select * query:
0: jdbc:hive2://> select * from testtab1;
15/07/01 07:32:46 [main]: ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
OK
+---------------+---------------+--+
| testtab1.id | testtab1.org |
+---------------+---------------+--+
| ankur | idc |
| user | idc |
| someone else | ssi |
+---------------+---------------+--+
3 rows selected (2.169 seconds)
Count(*) query
0: jdbc:hive2://> select count(*) from testtab1;
Query ID = ankuchak_20150701073407_e7fd66ae-8812-4e02-87d7-492f81781d15
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
15/07/01 07:34:08 [HiveServer2-Background-Pool: Thread-40]: ERROR mr.ExecDriver: yarn
15/07/01 07:34:08 [HiveServer2-Background-Pool: Thread-40]: WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
Starting Job = job_1435425589664_0005, Tracking URL = http://slc02khv:8088/proxy/application_1435425589664_0005/
Kill Command = /scratch/hadoop/hadoop/bin/hadoop job -kill job_1435425589664_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
15/07/01 07:34:16 [HiveServer2-Background-Pool: Thread-40]: WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-07-01 07:34:16,291 Stage-1 map = 0%, reduce = 0%
2015-07-01 07:34:23,831 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.04 sec
2015-07-01 07:34:30,102 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.41 sec
MapReduce Total cumulative CPU time: 2 seconds 410 msec
Ended Job = job_1435425589664_0005
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.41 sec HDFS Read: 6607 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 410 msec
OK
+------+--+
| _c0 |
+------+--+
| 3 |
+------+--+
1 row selected (23.527 seconds)
Group by query:
0: jdbc:hive2://> select org, count(id) from testtab1 group by org;
Query ID = ankuchak_20150701073540_5f20df4e-0bd4-4e18-b065-44c2688ce21f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
15/07/01 07:35:40 [HiveServer2-Background-Pool: Thread-63]: ERROR mr.ExecDriver: yarn
15/07/01 07:35:41 [HiveServer2-Background-Pool: Thread-63]: WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
Starting Job = job_1435425589664_0006, Tracking URL = http://slc02khv:8088/proxy/application_1435425589664_0006/
Kill Command = /scratch/hadoop/hadoop/bin/hadoop job -kill job_1435425589664_0006
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
15/07/01 07:35:47 [HiveServer2-Background-Pool: Thread-63]: WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-07-01 07:35:47,200 Stage-1 map = 0%, reduce = 0%
2015-07-01 07:35:53,494 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.05 sec
2015-07-01 07:36:00,799 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.53 sec
MapReduce Total cumulative CPU time: 2 seconds 530 msec
Ended Job = job_1435425589664_0006
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.53 sec HDFS Read: 7278 HDFS Write: 14 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 530 msec
OK
+-------+------+--+
| org | _c1 |
+-------+------+--+
| idc | 2 |
| ssi | 1 |
+-------+------+--+
2 rows selected (21.187 seconds)
Now the simple where clause:
0: jdbc:hive2://> select * from testtab1 where org='idc';
OK
+--------------+---------------+--+
| testtab1.id | testtab1.org |
+--------------+---------------+--+
+--------------+---------------+--+
No rows selected (0.11 seconds)
It would be great if you could provide me with some pointers.
Please let me know if you need further information in this regard.
Regards,
Ankur
A map job is occurring in your last query, so it's not that MapReduce isn't happening. However, some rows should be returned by your last query. The likely culprit is that for some reason it is not finding a match on the value "idc". Check your table and ensure that the org values for the ankur and user rows actually contain the string idc.
Try this to see if you get any results:
Select * from testtab1 where org rlike '.*(idc).*';
or
Select * from testtab1 where org like '%idc%';
These will grab any row that has a value containing the string 'idc'. Good luck!
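Another quick check (a sketch using the standard Hive trim() and length() functions): delimited text data often carries stray whitespace, which would make the exact match fail while the LIKE/RLIKE versions above still hit:
select id, org, length(org) from testtab1;
select * from testtab1 where trim(org) = 'idc';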
Here are details of the same error, which was fixed recently. Try verifying the Hive version you are using.

Hive failed with java.io.IOException(Max block location exceeded for split .... splitsize: 45 maxsize: 10)

The Hive query needs to process 45 files, each about 1 GB in size. After the mappers finished at 100%, Hive failed with the error message shown in the title.
Driver returned: 1. Errors: OK
Hive history file=/tmp/hue/hive_job_log_hue_201308221004_1738621649.txt
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1376898282169_0441, Tracking URL = http://SH02SVR2882.hadoop.sh2.ctripcorp.com:8088/proxy/application_1376898282169_0441/
Kill Command = //usr/lib/hadoop/bin/hadoop job -kill job_1376898282169_0441
Hadoop job information for Stage-1: number of mappers: 236; number of reducers: 0
2013-08-22 10:04:40,205 Stage-1 map = 0%, reduce = 0%
2013-08-22 10:05:07,486 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 121.28 sec
.......................
2013-08-22 10:09:18,625 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 7707.18 sec
MapReduce Total cumulative CPU time: 0 days 2 hours 8 minutes 27 seconds 180 msec
Ended Job = job_1376898282169_0441
Ended Job = -541447549, job is filtered out (removed at runtime).
Ended Job = -1652692814, job is filtered out (removed at runtime).
Launching Job 3 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job Submission failed with exception
'java.io.IOException(Max block location exceeded for split: Paths:/tmp/hive-beeswax-logging/hive_2013-08-22_10-04-32_755_6427103839442439579/-ext-10001/000009_0:0+28909,....,/tmp/hive-beeswax-logging/hive_2013-08-22_10-04-32_755_6427103839442439579/-ext-10001/000218_0:0+45856
Locations:10.8.75.17:...:10.8.75.20:; InputFormatClass: org.apache.hadoop.mapred.TextInputFormat
splitsize: 45 maxsize: 10)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 236 Cumulative CPU: 7707.18 sec HDFS Read: 63319449229 HDFS Write: 8603165 SUCCESS
Total MapReduce CPU Time Spent: 0 days 2 hours 8 minutes 27 seconds 180 msec
But I didn't set maxsize. I executed the query many times and got the same error. I tried adding the mapreduce.jobtracker.split.metainfo.maxsize property for Hive, but in that case Hive failed without doing any map work.
Set mapreduce.job.max.split.locations to a value greater than 45.
In our situation this solved the problem.
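For example (the exact value is arbitrary as long as it exceeds the 45 locations reported in the error):
set mapreduce.job.max.split.locations=60;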
