Pivot Table in Hive and Create Multiple Columns for Unique Combinations - hadoop

I want to pivot the following table
| ID | Code | date | qty |
| 1 | A | 1/1/19 | 11 |
| 1 | A | 2/1/19 | 12 |
| 2 | B | 1/1/19 | 13 |
| 2 | B | 2/1/19 | 14 |
| 3 | C | 1/1/19 | 15 |
| 3 | C | 3/1/19 | 16 |
into
| ID | Code | mth_1(1/1/19) | mth_2(2/1/19) | mth_3(3/1/19) |
| 1 | A | 11 | 12 | 0 |
| 2 | B | 13 | 14 | 0 |
| 3 | C | 15 | 0 | 16 |
I am new to hive, i am not sure how to implement it.
NOTE: I don't want to do mapping because my month values change over time.

Related

Buffers used by WINDOW SORT operation

I have a query with the following execution plan:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 21741 |00:00:11.38 | 150K| 1088 | | | |
| 1 | SORT AGGREGATE | | 46072 | 1 | 46072 |00:00:02.92 | 138K| 241 | | | |
| 2 | FIRST ROW | | 46072 | 1 | 3761 |00:00:02.83 | 138K| 241 | | | |
|* 3 | INDEX RANGE SCAN (MIN/MAX) | VERP_VIG_VEHICLE_STAGES_N2 | 46072 | 1 | 3761 |00:00:02.79 | 138K| 241 | | | |
|* 4 | HASH JOIN RIGHT OUTER | | 1 | 37010 | 21741 |00:00:11.38 | 150K| 1088 | 3272K| 1218K| 3302K (0)|
| 5 | VIEW | | 1 | 7402 | 23548 |00:00:11.17 | 147K| 1088 | | | |
| 6 | WINDOW SORT | | 1 | 7402 | 23548 |00:00:10.82 | 79621 | 1088 | 4801K| 915K| 4267K (0)|
|* 7 | HASH JOIN RIGHT OUTER | | 1 | 7402 | 23548 |00:00:07.84 | 8837 | 847 | 1599K| 1599K| 996K (0)|
| 8 | TABLE ACCESS FULL | VERP_OTM_PS_CONTROL_TABLE | 1 | 5 | 5 |00:00:00.01 | 39 | 0 | | | |
|* 9 | FILTER | | 1 | | 23548 |00:00:07.80 | 8798 | 847 | | | |
|* 10 | HASH JOIN RIGHT OUTER | | 1 | 7402 | 71904 |00:00:07.76 | 8798 | 847 | 1421K| 1421K| 1756K (0)|
| 11 | VIEW | | 1 | 4534 | 4554 |00:00:00.01 | 27 | 0 | | | |
|* 12 | HASH JOIN | | 1 | 4534 | 4554 |00:00:00.01 | 27 | 0 | 1888K| 1888K| 1596K (0)|
| 13 | INDEX FULL SCAN | VERP_VPS_SUPPLY_VVP_N1 | 1 | 27 | 27 |00:00:00.01 | 1 | 0 | | | |
| 14 | INDEX FULL SCAN | VERP_VPS_SUPPLY_VVVP_N1 | 1 | 4534 | 4554 |00:00:00.01 | 26 | 0 | | | |
|* 15 | HASH JOIN | | 1 | 37010 | 71904 |00:00:07.67 | 8771 | 847 | 1245K| 1245K| 1722K (0)|
| 16 | SORT UNIQUE | | 1 | 37010 | 1586 |00:00:00.05 | 3279 | 0 | 124K| 124K| 110K (0)|
| 17 | TABLE ACCESS FULL | VERP_OTM_STAGED_VONS | 1 | 37010 | 21741 |00:00:00.02 | 3279 | 0 | | | |
| 18 | TABLE ACCESS BY INDEX ROWID BATCHED| VERP_VIG_VEHICLES | 1 | 246K| 36104 |00:00:07.53 | 5492 | 847 | | | |
|* 19 | INDEX RANGE SCAN | VERP_VIG_VEHICLES_N22 | 1 | 246K| 36104 |00:00:07.38 | 891 | 838 | | | |
| 20 | VIEW | | 1 | 37010 | 21741 |00:00:00.12 | 3279 | 0 | | | |
| 21 | WINDOW SORT | | 1 | 37010 | 21741 |00:00:00.11 | 3279 | 0 | 1612K| 624K| 1432K (0)|
| 22 | TABLE ACCESS FULL | VERP_OTM_STAGED_VONS | 1 | 37010 | 21741 |00:00:00.03 | 3279 | 0 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("S"."VIN"=:B1 AND "S"."STAGE_CODE"='YARD_RECEIPT')
4 - access("VINS"."VIN_SEQUENCE"="VONS"."VON_SEQUENCE" AND "VINS"."PORT_CODE"="VONS"."PORT_CODE" AND "VINS"."INT_COLOR_CODE"="VONS"."INT_COLOR_CODE" AND
"VINS"."EXT_COLOR_CODE"="VONS"."EXT_COLOR_CODE" AND "VINS"."SPEC_CODE"="VONS"."SPEC_CODE" AND "VINS"."OPTION_CODE"="VONS"."OPTION_CODE" AND
"VINS"."MODEL_CODE"="VONS"."MODEL_CODE")
7 - access("C"."PORT"=CASE "VVV"."VEHICLE_SOURCE" WHEN 'SIA' THEN '020' ELSE "from$_subquery$_006"."PORT_CODE" END )
9 - filter("VPT"."PORT"=CASE "VVV"."VEHICLE_SOURCE" WHEN 'SIA' THEN '020' ELSE "from$_subquery$_006"."PORT_CODE" END )
10 - access("VVVP"."VESSEL_PORT_ID"="VVV"."VESSEL_PORT_ID")
12 - access("VVP"."PORT_ID"="VVVP"."PORT_ID")
15 - access("VVV"."SOA_MODEL_CODE"="VPT"."MODEL_CODE" AND "VVV"."SOA_OPTION_CODE"="VPT"."OPTION_CODE" AND "VVV"."SOA_SPEC_CODE"="VPT"."SPEC_CODE" AND
"VVV"."SOA_EXT_COLOR_CODE"="VPT"."EXT_COLOR_CODE" AND "VVV"."SOA_INT_COLOR_CODE"="VPT"."INT_COLOR_CODE")
19 - access("VVV"."PS_STATUS"='NOT_MATCHED')
I am interested to know why the WINDOW SORT operation in step #6 is requiring so many buffer gets. I usually don't see that sort of thing for a WINDOW SORT operation. For example see the operation in step 21 of the same plan -- no additional buffer gets.
Does anyone know what these buffer gets are? I suspect that maybe the sort operation is spilling to disk, due to its size, and the extra buffer gets are to access those temp tablespace blocks. I'd like confirmation or alternate explanations, as appropriate. Thanks.
UPDATE
To hopefully clarify: I want to know why step 6 required added buffer gets beyond what was required to get through step 7. I.e., why it is not like the buffer gets in step 21, which did not increase the number from what was necessary to get through step 22.

Indexes hints in a Subquery

I have a SQL statement that has performance issues.
Adding the following index and a SQL hint to use the index improves the performance 10 fold but I do not understand why.
BUS_ID is part of the primary key(T1.REF is the other part fo the key) and clustered index on the T1 table.
The T1 table has about 100,000 rows. BUS_ID has only 6 different values. Similarly the T1.STATUS column can only have a limited number of
possibilities and the majority of these(99%) will be the same value.
If I run the query without the hint(/*+ INDEX ( T1 T1_IDX1) NO_UNNEST */) it takes 5 seconds and with the hint it takes .5 seconds.
I don't understand how the index helps the subquery as T1.STATUS isn't used in any of the 'where' or 'join' clauses in the subquery.
What am I missing?
SELECT
/*+ NO_UNNEST */
t1.bus_id,
t1.ref,
t2.cust,
t3.cust_name,
t2.po_number,
t1.status_old,
t1.status,
t1.an_status
FROM t1
LEFT JOIN t2
ON t1.bus_id = t2.bus_id
AND t1.ref = t2.ref
JOIN t3
ON t3.cust = t2.cust
AND t3.bus_id = t2.bus_id
WHERE (
status IN ('A', 'B', 'C') AND status_old IN ('X', 'Y'))
AND EXISTS
( SELECT /*+ INDEX ( T1 T1_IDX1) NO_UNNEST */
*
FROM t1
WHERE ( EXISTS ( SELECT /*+ NO_UNNEST */
*
FROM t6
WHERE seq IN ( '0', '2' )
AND t1.bus_id = t6.bus_id)
OR (EXISTS
(SELECT /*+ NO_UNNEST */
*
FROM t6
WHERE seq = '1'
AND (an_status = 'Y'
OR
an_status = 'X')
AND t1.bus_id = t6.bus_id))
AND t2.ref = t1.ref))
AND USER IN ('FRED')
AND ( t2.status != '45'
AND t2.status != '20')
AND NOT EXISTS ( SELECT
/*+ NO_UNNEST */
*
FROM t4
WHERE EXISTS
(
SELECT
/*+ NO_UNNEST */
*
FROM t5
WHERE pd IN ( '1',
'0' )
AND appl = 'RYP'
AND appl_id IN ( 'RL100')
AND t4.id = t5.id)
AND t2.ref = p.ref
AND t2.bus_id = p.bus_id);
Edited to include Explain Plan and index.
Without Index hint
------------------------------------------------------|-------------------------------------
Operation | Options |Cost| # |Bytes | CPU Cost | IO COST
------------------------------------------------------|-------------------------------------
select statement | | 20 | 1 | 211 | 15534188 | 19 |
view | | 20 | 1 | 211 | 15534188 | 19 |
count | | | | | | |
view | | 20 | 1 | 198 | 15534188 | 19 |
sort | ORDER BY | 20 | 1 | 114 | 15534188 | 19 |
nested loops | | 7 | 1 | 114 | 62487 | 7 |
nested loops | | 7 | 1 | 114 | 62487 | 7 |
nested loops | | 6 | 1 | 84 | 53256 | 6 |
inlist iterator | | | | | | |
TABLE access t1 | INDEX ROWID | 4 | 1 | 29 | 36502 | 4 |
index-t1_idx#3 | RANGE SCAN | 3 | 1 | | 28686 | 3 |
TABLE access - t2 | INDEX ROWID | 2 | 1 | 55 | 16754 | 2 |
index t2_idx#0 | UNIQUE SCAN | 1 | 1 | | 9042 | 1 |
filter | | | | | | |
TABLE access-t1 | INDEX ROWID | 2 | 1 | 15 | 7433 | 2 |
TABLE access-t6 | INDEX ROWID | 3 | 1 | 4 | 23169 | 3 |
index-t6_idx#0 | UNIQUE RANGE SCAN | 1 | 3 | | 7721 | 1 |
filter | | | | | | |
TABLE access-t6 | INDEX ROWID | 2 | 2 | 8 | 15363 | 2 |
index-t6_idx#0 | UNIQUE RANGE SCAN | 1 | 3 | | 7521 | 1 |
index-t4_idx#1 | RANGE SCAN | 3 | 1 | 28 | 21584 | 3 |
inlist iterator | | | | | | |
index-t5_idx#1 | RANGE SCAN | 4 | 1 | 24 | 42929 | 4 |
index-t3_idx#0 | INDEX UNIQUE SCAN | 0 | 1 | | 1900 | 0 |
TABLE access-t3 | INDEX ROWID | 1 | 1 | 30 | 9231 | 1 |
--------------------------------------------------------------------------------------------
With Index hint
------------------------------------------------------|-------------------------------------
Operation | Options |Cost| # |Bytes | CPU Cost | IO COST
------------------------------------------------------|-------------------------------------
select statement | | 21 | 1 | 211 | 15549142 | 19 |
view | | 21 | 1 | 211 | 15549142 | 19 |
count | | | | | | |
view | | 21 | 1 | 198 | 15549142 | 19 |
sort | ORDER BY | 21 | 1 | 114 | 15549142 | 19 |
nested loops | | 7 | 1 | 114 | 62487 | 7 |
nested loops | | 7 | 1 | 114 | 62487 | 7 |
nested loops | | 6 | 1 | 84 | 53256 | 6 |
inlist iterator | | | | | | |
TABLE access t1 | INDEX ROWID | 4 | 1 | 29 | 36502 | 4 |
index-t1_idx#3 | RANGE SCAN | 3 | 1 | | 28686 | 3 |
TABLE access - t2 | INDEX ROWID | 2 | 1 | 55 | 16754 | 2 |
index t2_idx#0 | UNIQUE SCAN | 1 | 1 | | 9042 | 1 |
filter | | | | | | |
TABLE access-t1 | INDEX ROWID | 3 | 1 | 15 | 22387 | 2 |
index-t1_idx#1 | FULL SCAN | 2 |97k| | 14643 | |
TABLE access-t6 | INDEX ROWID | 3 | 1 | 4 | 23169 | 3 |
index-t6_idx#0 | UNIQUE RANGE SCAN | 1 | 3 | | 7721 | 1 |
filter | | | | | | |
TABLE access-t6 | INDEX ROWID | 2 | 2 | 8 | 15363 | 2 |
index-t6_idx#0 | UNIQUE RANGE SCAN | 1 | 3 | | 7521 | 1 |
index-t4_idx#1 | RANGE SCAN | 3 | 1 | 28 | 21584 | 3 |
inlist iterator | | | | | | |
index-t5_idx#1 | RANGE SCAN | 4 | 1 | 24 | 42929 | 4 |
index-t3_idx#0 | INDEX UNIQUE SCAN | 0 | 1 | | 1900 | 0 |
TABLE access-t3 | INDEX ROWID | 1 | 1 | 30 | 9231 | 1 |
--------------------------------------------------------------------------------------------
Table Index
CREATE INDEX T1_IDX#1 ON T1 (BUS_ID, STATUS)

Oracle "Total" plan cost is really less than some of it's elements

I cannot figure out why sometimes, the total cost of a plan can be a very small number whereas looking inside the plan we can find huge costs. (indeed the query is very slow).
Can somebody explain me that?
Here is an example.
Apparently the costful part comes from a field in the main select that does a listagg on a subview and the join condition with this subview contains a complex condition (we can join on one field or another).
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 875 | 20 |
| 1 | SORT GROUP BY | | 1 | 544 | |
| 2 | VIEW | | 1 | 544 | 3 |
| 3 | SORT UNIQUE | | 1 | 481 | 3 |
| 4 | NESTED LOOPS | | | | |
| 5 | NESTED LOOPS | | 3 | 1443 | 2 |
| 6 | TABLE ACCESS BY INDEX ROWID | | 7 | 140 | 1 |
| 7 | INDEX RANGE SCAN | | 7 | | 1 |
| 8 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 9 | TABLE ACCESS BY INDEX ROWID | | 1 | 461 | 1 |
| 10 | SORT GROUP BY | | 1 | 182 | |
| 11 | NESTED LOOPS | | | | |
| 12 | NESTED LOOPS | | 8 | 1456 | 3 |
| 13 | NESTED LOOPS | | 8 | 304 | 2 |
| 14 | TABLE ACCESS BY INDEX ROWID | | 7 | 154 | 1 |
| 15 | INDEX RANGE SCAN | | 7 | | 1 |
| 16 | INDEX RANGE SCAN | | 1 | 16 | 1 |
| 17 | INDEX RANGE SCAN | | 1 | | 1 |
| 18 | TABLE ACCESS BY INDEX ROWID | | 1 | 144 | 1 |
| 19 | SORT GROUP BY | | 1 | 268 | |
| 20 | VIEW | | 1 | 268 | 9 |
| 21 | SORT UNIQUE | | 1 | 108 | 9 |
| 22 | CONCATENATION | | | | |
| 23 | NESTED LOOPS | | | | |
| 24 | NESTED LOOPS | | 1 | 108 | 4 |
| 25 | NESTED LOOPS | | 1 | 79 | 3 |
| 26 | NESTED LOOPS | | 1 | 59 | 2 |
| 27 | TABLE ACCESS BY INDEX ROWID | | 1 | 16 | 1 |
| 28 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 29 | TABLE ACCESS BY INDEX ROWID | | 1 | 43 | 1 |
| 30 | INDEX RANGE SCAN | | 1 | | 1 |
| 31 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 32 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 33 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 34 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 35 | NESTED LOOPS | | | | |
| 36 | NESTED LOOPS | | 1 | 108 | 4 |
| 37 | NESTED LOOPS | | 1 | 79 | 3 |
| 38 | NESTED LOOPS | | 1 | 59 | 2 |
| 39 | TABLE ACCESS BY INDEX ROWID | | 4 | 64 | 1 |
| 40 | INDEX RANGE SCAN | | 2 | | 1 |
| 41 | TABLE ACCESS BY INDEX ROWID | | 1 | 43 | 1 |
| 42 | INDEX RANGE SCAN | | 1 | | 1 |
| 43 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 44 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 45 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 46 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 47 | SORT GROUP BY | | 1 | 330 | |
| 48 | VIEW | | 1 | 330 | 26695 |
| 49 | SORT UNIQUE | | 1 | 130 | 26695 |
| 50 | CONCATENATION | | | | |
| 51 | HASH JOIN ANTI | | 1 | 130 | 13347 |
| 52 | NESTED LOOPS | | | | |
| 53 | NESTED LOOPS | | 1 | 110 | 4 |
| 54 | NESTED LOOPS | | 1 | 81 | 3 |
| 55 | NESTED LOOPS | | 1 | 61 | 2 |
| 56 | TABLE ACCESS BY INDEX ROWID | | 1 | 16 | 1 |
| 57 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 58 | TABLE ACCESS BY INDEX ROWID | | 1 | 45 | 1 |
| 59 | INDEX RANGE SCAN | | 1 | | 1 |
| 60 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 61 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 62 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 63 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 64 | VIEW | | 164K| 3220K| 13341 |
| 65 | NESTED LOOPS | | | | |
| 66 | NESTED LOOPS | | 164K| 11M| 13341 |
| 67 | NESTED LOOPS | | 164K| 8535K| 10041 |
| 68 | TABLE ACCESS BY INDEX ROWID | | 164K| 6924K| 8391 |
| 69 | INDEX SKIP SCAN | | 2131K| | 163 |
| 70 | INDEX UNIQUE SCAN | | 1 | 10 | 1 |
| 71 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 72 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 73 | HASH JOIN ANTI | | 2 | 260 | 13347 |
| 74 | NESTED LOOPS | | | | |
| 75 | NESTED LOOPS | | 2 | 220 | 4 |
| 76 | NESTED LOOPS | | 2 | 162 | 3 |
| 77 | NESTED LOOPS | | 2 | 122 | 2 |
| 78 | TABLE ACCESS BY INDEX ROWID | | 4 | 64 | 1 |
| 79 | INDEX RANGE SCAN | | 2 | | 1 |
| 80 | TABLE ACCESS BY INDEX ROWID | | 1 | 45 | 1 |
| 81 | INDEX RANGE SCAN | | 1 | | 1 |
| 82 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 83 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 84 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 85 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 86 | VIEW | | 164K| 3220K| 13341 |
| 87 | NESTED LOOPS | | | | |
| 88 | NESTED LOOPS | | 164K| 11M| 13341 |
| 89 | NESTED LOOPS | | 164K| 8535K| 10041 |
| 90 | TABLE ACCESS BY INDEX ROWID | | 164K| 6924K| 8391 |
| 91 | INDEX SKIP SCAN | | 2131K| | 163 |
| 92 | INDEX UNIQUE SCAN | | 1 | 10 | 1 |
| 93 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 94 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 95 | NESTED LOOPS OUTER | | 1 | 875 | 20 |
| 96 | NESTED LOOPS OUTER | | 1 | 846 | 19 |
| 97 | NESTED LOOPS OUTER | | 1 | 800 | 18 |
| 98 | NESTED LOOPS OUTER | | 1 | 776 | 17 |
| 99 | NESTED LOOPS OUTER | | 1 | 752 | 16 |
| 100 | NESTED LOOPS OUTER | | 1 | 641 | 15 |
| 101 | NESTED LOOPS OUTER | | 1 | 576 | 14 |
| 102 | NESTED LOOPS OUTER | | 1 | 554 | 13 |
| 103 | NESTED LOOPS OUTER | | 1 | 487 | 12 |
| 104 | NESTED LOOPS OUTER | | 1 | 434 | 11 |
| 105 | NESTED LOOPS | | 1 | 368 | 10 |
| 106 | NESTED LOOPS | | 1 | 102 | 9 |
| 107 | NESTED LOOPS OUTER | | 1 | 85 | 8 |
| 108 | NESTED LOOPS | | 1 | 68 | 7 |
| 109 | NESTED LOOPS | | 50 | 2700 | 6 |
| 110 | HASH JOIN | | 53 | 1696 | 5 |
| 111 | INLIST ITERATOR | | | | |
| 112 | TABLE ACCESS BY INDEX ROWID| | 520 | 10400 | 3 |
| 113 | INDEX RANGE SCAN | | 520 | | 1 |
| 114 | INLIST ITERATOR | | | | |
| 115 | TABLE ACCESS BY INDEX ROWID| | 91457 | 1071K| 1 |
| 116 | INDEX UNIQUE SCAN | | 2 | | 1 |
| 117 | TABLE ACCESS BY INDEX ROWID | | 1 | 22 | 1 |
| 118 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 119 | TABLE ACCESS BY INDEX ROWID | | 1 | 14 | 1 |
| 120 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 121 | TABLE ACCESS BY INDEX ROWID | | 1 | 17 | 1 |
| 122 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 123 | TABLE ACCESS BY INDEX ROWID | | 1 | 17 | 1 |
| 124 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 125 | TABLE ACCESS BY INDEX ROWID | | 1 | 266 | 1 |
| 126 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 127 | TABLE ACCESS BY INDEX ROWID | | 1 | 66 | 1 |
| 128 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 129 | TABLE ACCESS BY INDEX ROWID | | 1 | 53 | 1 |
| 130 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 131 | TABLE ACCESS BY INDEX ROWID | | 1 | 67 | 1 |
| 132 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 133 | INDEX RANGE SCAN | | 1 | 22 | 1 |
| 134 | TABLE ACCESS BY INDEX ROWID | | 1 | 65 | 1 |
| 135 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 136 | TABLE ACCESS BY INDEX ROWID | | 1 | 111 | 1 |
| 137 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 138 | TABLE ACCESS BY INDEX ROWID | | 1 | 24 | 1 |
| 139 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 140 | TABLE ACCESS BY INDEX ROWID | | 1 | 24 | 1 |
| 141 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 142 | TABLE ACCESS BY INDEX ROWID | | 1 | 46 | 1 |
| 143 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 144 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 145 | INDEX UNIQUE SCAN | | 1 | | 1 |
----------------------------------------------------------------------------------------------------------
The total cost of a statement is usually equal to or greater than the cost of any of its child operations. There are at least 4 exceptions to this rule.
Your plan looks like #3 but we can't be sure without looking at code.
1. FILTER
Execution plans may depend on conditions at run-time. These conditions cause FILTER operations that will dynamically decide which query block to execute. The example below uses a static condition but still demonstrates the concept. Part of the subquery is very expensive but the condition negates the whole thing.
explain plan for select * from dba_objects cross join dba_objects where 1 = 2;
select * from table(dbms_xplan.display(format => 'basic +cost'));
Plan hash value: 3258663795
--------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 0 (0)|
| 1 | FILTER | | |
| 2 | MERGE JOIN CARTESIAN | | 11M (3)|
...
2. COUNT STOPKEY
Execution plans sum child operations up until the final cost. But child operations will not always finish. In the example below it may be correct to say that part of the plan costs 214. But because of the condition where rownum <= 1 only part of that child operation may run.
explain plan for
select /*+ no_query_transformation */ *
from (select * from dba_objects join dba_objects using (owner))
where rownum <= 1;
select * from table(dbms_xplan.display(format => 'basic +cost'));
Plan hash value: 2132093199
-------------------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 (0)|
| 1 | COUNT STOPKEY | | |
| 2 | VIEW | | 4 (0)|
| 3 | VIEW | | 4 (0)|
| 4 | NESTED LOOPS | | 4 (0)|
| 5 | VIEW | DBA_OBJECTS | 2 (0)|
| 6 | UNION-ALL | | |
| 7 | HASH JOIN | | 3 (34)|
| 8 | INDEX FULL SCAN | I_USER2 | 1 (0)|
| 9 | VIEW | _CURRENT_EDITION_OBJ | 1 (0)|
| 10 | FILTER | | |
| 11 | HASH JOIN | | 214 (3)|
...
3. Subqueries in the SELECT column list
Cost aggregation does not include subqueries in the SELECT column list. A query like select ([expensive query]) from dual; will have a very small total cost. I don't understand the reason for this; Oracle estimates the subquery and he number of rows in the FROM, surely it could multiply them together for a total cost.
explain plan for
select dummy,(select count(*) from dba_objects cross join dba_objects) from dual;
select * from table(dbms_xplan.display(format => 'basic +cost'));
Plan hash value: 3705842531
---------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
---------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 (0)|
| 1 | SORT AGGREGATE | | |
| 2 | MERGE JOIN CARTESIAN | | 11M (3)|
...
4. Other - rounding? bugs?
About 0.01% of plans still have unexplainable cost issues. I can't find any pattern among them. Perhaps it's just a rounding issue or some rare optimizer bugs. There will always be some weird cases with a any model as complicated as the optimizer.
Check for more exceptions
This query can find other exceptions, it returns all plans where the first cost is less than the maximum cost.
select *
from
(
--First and Max cost per plan.
select
sql_id, plan_hash_value, id, cost
,max(cost) keep (dense_rank first order by id)
over (partition by sql_id, plan_hash_value) first_cost
,max(cost)
over (partition by sql_id, plan_hash_value) max_cost
,max(case when operation = 'COUNT' and options = 'STOPKEY' then 1 else 0 end)
over (partition by sql_id, plan_hash_value) has_count_stopkey
,max(case when operation = 'FILTER' and options is null then 1 else 0 end)
over (partition by sql_id, plan_hash_value) has_filter
,count(distinct(plan_hash_value))
over () total_plans
from v$sql_plan
--where sql_id = '61a161nm1ttjj'
order by 1,2,3
)
where first_cost < max_cost
--It's easy to exclude FILTER and COUNT STOPKEY.
and has_filter = 0
and has_count_stopkey = 0
order by 1,2,3;

mysql table unaligned in console output when using UTF8

I like to use mysql client. But when using UTF-8, the tables on the console are unaligned:
> set names utf8;
> [some query]
+--------+---------+---------------------------------+-----------------------------+----------+---------+-----------+-------+---------+-----------+
| RuleId | TaxonId | Note | NoteSci | MinCount | DayFrom | MonthFrom | DayTo | MonthTo | ExtraNote |
+--------+---------+---------------------------------+-----------------------------+----------+---------+-----------+-------+---------+-----------+
| 722 | 10090 | sedmihlásek malý | Hippolais caligata | 1 | 1 | 1 | 31 | 12 | NULL |
| 727 | 10059 | Anseranas semipalmata | husovec strakatý | 1 | 1 | 1 | 31 | 12 | NULL |
| 728 | 10062 | Cygnus atratus | labuť černá | 1 | 1 | 1 | 31 | 12 | NULL |
| 729 | 10094 | Anser cygnoides | husa labutí | 1 | 1 | 1 | 31 | 12 | NULL |
| 730 | 10063 | Tadorna cana | husice šedohlavá | 1 | 1 | 1 | 31 | 12 | NULL |
| 731 | 10031 | Cairina moschata f. domestica | pižmovka domácí | 20 | 1 | 1 | 31 | 12 | NULL |
| 732 | 10088 | Cairina scutulata | pižmovka bělokřídlá | 1 | 1 | 1 | 31 | 12 | NULL |
| 733 | 10087 | Anas sibilatrix | hvízdák chilský | 1 | 1 | 1 | 31 | 12 | NULL |
| 734 | 10077 | Anas platyrhynchos f. domestica | kachna domácí | 1000 | 1 | 1 | 31 | 12 | NULL |
| 735 | 10086 | Anas hottentota | čírka hottentotská | 1 | 1 | 1 | 31 | 12 | NULL |
|
This is apparently because mysql client will compute the widths of the columns using string length which doesn't take UTF-8 characters into account - so then there is exactly one space missing for each accented character (because these actually take two bytes).
Do you know possible workaround for this problem?
Run your mysql client with charset option:
mysql -uUSER -p DATABASE --default-character-set=utf8
(USER and DATABASE should be replaced with actual credentials data)

SphinxSE distinct empty result

I run this query in sphinx se console:
SELECT #distinct FROM all_ips GROUP BY ip1;
I get this result:
+------+--------+
| id | weight |
+------+--------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 9 | 1 |
| 15 | 1 |
| 16 | 1 |
| 17 | 1 |
| 20 | 1 |
| 21 | 1 |
| 25 | 1 |
| 26 | 1 |
| 27 | 1 |
| 31 | 1 |
| 32 | 1 |
| 38 | 1 |
| 39 | 1 |
| 40 | 1 |
| 46 | 1 |
| 50 | 1 |
| 51 | 1 |
+------+--------+
20 rows in set (0.57 sec)
How can i get number of unique values? Why #distinct column doesn't show up in results?
1) I dont think that is sphinxSE - do you really mean sphinxQL? That looks more like sphinxQL.
2) Distinct of what column? You need to sell sphinx what attribute you want to count the distinct values in. In sphinxQL use COUNT(DISTINCT column_name)
You will require simple SQL statement for getting count. Something like this
SELECT count(ip1),ip1
FROM all_ips
GROUP BY ip1;

Resources