I have the following query, and it takes 12 hours to execute in HUE. What changes can I make to the query to improve its performance in the HUE environment?
SELECT ordernum,
Min(distance) mindist,
Min(CASE
WHEN type_name = 'T'
OR ( type_name = 'I'
AND item LIKE '%D%' ) THEN distance
ELSE 9999999
END) min_t,
Min(CASE
WHEN type_name = 'A' THEN distance
ELSE 9999999
END) min_a
FROM (SELECT a.ordernum,
b.id,
b.type_name,
b.item,
Round(Least(Sqrt(Pow(b.sty-a.nrthng, 2)
+ Pow(b.stx-a.estng, 2)),
Sqrt(Pow(b.endy-a.nrthng, 2)
+ Pow(b.endx-a.estng, 2))))
distance
FROM temp_b a,
min_b1 b
WHERE ( ( b.stx BETWEEN ( a.estng - 1000 ) AND ( a.estng + 1000 )
AND b.sty BETWEEN ( a.nrthng - 1000 ) AND
( a.nrthng + 1000 ) )
OR ( b.endx BETWEEN ( a.estng - 1000 ) AND ( a.estng + 1000 )
AND b.endy BETWEEN ( a.nrthng - 1000 ) AND
( a.nrthng + 1000 ) ) )) a
GROUP BY ordernum
My concerns are about your query's join condition.
As I see it, you have tables a and b. Are there any key fields on which the tables could be matched? That is, does some field f1 in table a have the same meaning as some field f2 in table b, so that the tables could be joined on them?
You could also create a temporary table containing the information from both tables to remove the overhead of network communication and data transfer, as I assume your Hadoop cluster has more than a single node.
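If the two tables share no natural key, one option is to derive one. The sketch below is an assumption on my part, not something stated in the question: it buckets coordinates into 1000-unit grid cells so that each order only has to be compared with points in its own and the eight neighbouring cells, instead of with every row of the other table. The names `cell` and `candidate_pairs` are hypothetical, and it indexes a single point per row; for the query above, the start and end points of each segment would each need to be indexed.

```python
from collections import defaultdict
from math import floor

RADIUS = 1000  # same +/-1000 window as the BETWEEN predicates in the query


def cell(x, y, size=RADIUS):
    """Map a coordinate to an integer grid cell (a hypothetical derived join key)."""
    return (floor(x / size), floor(y / size))


def candidate_pairs(orders, points):
    """Yield (ordernum, point_id) pairs whose point is within +/-RADIUS in x and y.

    orders: iterable of (ordernum, estng, nrthng)
    points: iterable of (point_id, x, y)
    """
    buckets = defaultdict(list)
    for pid, x, y in points:
        buckets[cell(x, y)].append((pid, x, y))

    for ordernum, ex, ny in orders:
        cx, cy = cell(ex, ny)
        # A point within RADIUS must sit in the same cell or an adjacent one.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for pid, x, y in buckets.get((cx + dx, cy + dy), ()):
                    if abs(x - ex) <= RADIUS and abs(y - ny) <= RADIUS:
                        yield ordernum, pid
```

In SQL terms this would mean adding the cell columns to both tables and joining on them with an equality condition, which engines like Hive can distribute much more cheaply than a range-only cross join. Whether it helps here depends on how the data is spread across cells.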
SELECT 1 + (SELECT count( * ) FROM student_rank a WHERE a.obtained_total_mark > b.obtained_total_mark ) AS rank FROM student_rank b WHERE student_id = '5' ORDER BY rank LIMIT 1 ;
convert search query
use
$this->db->query("SELECT 1 + (SELECT count( * ) FROM student_rank a WHERE a.obtained_total_mark > b.obtained_total_mark ) AS rank FROM student_rank b WHERE student_id = '5' ORDER BY rank LIMIT 1")
I am trying to create a CTE and use it in a MERGE statement in Oracle, but I am getting an error, so please have a look and help me out. Earlier we used a subquery instead of a CTE; I am trying the CTE to improve the query's response time, so please also suggest any other approach that would improve it. I am curious whether Oracle supports using a CTE and a MERGE statement together, as I did in the code below.
With TRANS_HIST
As
(Select
NUMERO_DE_CUENTA,
TRANS_DATETIME,
Lag(TRANS_DATETIME, 1)
over
(
ORDER BY NUMERO_DE_CUENTA,TRANS_DATETIME) lag_trans_datetime
FROM db_fraud_bpd.tbl_event_new_transaction_h
)
MERGE
INTO DB_FRAUD_BPD.TBL_RT_FEATURES_TEMP t1
USING
(
SELECT
RTTEMP.ACCOUNT_NUMBER,
RTTEMP.TRANS_DATETIME,
CASE WHEN STDDEV(TRANS_HIST.TRANS_DATETIME - TRANS_HIST.lag_trans_datetime) = 0 THEN NULL
WHEN ROUND(( ( (RTTEMP.TRANS_DATETIME - Max(TRANS_HIST.TRANS_DATETIME)) - Avg(TRANS_HIST.TRANS_DATETIME - TRANS_HIST.lag_trans_datetime)) / STDDEV(TRANS_HIST.TRANS_DATETIME -TRANS_HIST.lag_trans_datetime)),3) >999999999999999 THEN 999999999999999
WHEN ROUND(( ( (RTTEMP.TRANS_DATETIME - Max(TRANS_HIST.TRANS_DATETIME)) - Avg(TRANS_HIST.TRANS_DATETIME - TRANS_HIST.lag_trans_datetime)) / STDDEV(TRANS_HIST.TRANS_DATETIME -TRANS_HIST.lag_trans_datetime)),3) <-99999999999999 THEN -99999999999999
ELSE ROUND(( ( (RTTEMP.TRANS_DATETIME - Max(TRANS_HIST.TRANS_DATETIME)) - Avg(TRANS_HIST.TRANS_DATETIME - TRANS_HIST.lag_trans_datetime)) / STDDEV(TRANS_HIST.TRANS_DATETIME -TRANS_HIST.lag_trans_datetime)),3)
END AS TIME_DELTA_ZSCORE_PAST_90_DAYS
FROM TRANS_HIST
right outer join
db_fraud_bpd.tbl_rt_features_temp RTTEMP
ON Cast(RTTEMP.account_number AS INTEGER) = Cast(TRANS_HIST.NUMERO_DE_CUENTA AS INTEGER)
WHERE
(
TRANS_HIST.TRANS_DATETIME < RTTEMP.TRANS_DATETIME
AND TRANS_HIST.TRANS_DATETIME >= (RTTEMP.TRANS_DATETIME-90)
)
or TRANS_HIST.TRANS_DATETIME IS NULL
GROUP BY
RTTEMP.account_number,
RTTEMP.TRANS_DATETIME
)TEMP
ON (t1.TRANS_DATETIME=TEMP.TRANS_DATETIME AND t1.ACCOUNT_NUMBER=TEMP.ACCOUNT_NUMBER)
WHEN MATCHED
THEN UPDATE
SET t1.TIME_DELTA_ZSCORE_PAST_90_DAYS = TEMP.TIME_DELTA_ZSCORE_PAST_90_DAYS
You may use a CTE in the MERGE statement, but it must be in the right position: inside the USING subquery.
See the example below:
merge into tab
using (with t as (
select 1 id, 'x' x from dual union all
select 2 id, 'y' x from dual)
select * from t) b
on (tab.id = b.id)
when matched then
update set tab.x = b.x
when not matched then
insert (id, x)
values (b.id, b.x);
Don't use TRANS_HIST as a CTE, but as a subquery.
MERGE INTO DB_FRAUD_BPD.TBL_RT_FEATURES_TEMP t1
USING ( SELECT RTTEMP.ACCOUNT_NUMBER,
RTTEMP.TRANS_DATETIME,
CASE
WHEN STDDEV (
TRANS_HIST.TRANS_DATETIME
- TRANS_HIST.lag_trans_datetime) =
0
THEN
NULL
WHEN ROUND (
( ( ( RTTEMP.TRANS_DATETIME
- MAX (TRANS_HIST.TRANS_DATETIME))
- AVG (
TRANS_HIST.TRANS_DATETIME
- TRANS_HIST.lag_trans_datetime))
/ STDDEV (
TRANS_HIST.TRANS_DATETIME
- TRANS_HIST.lag_trans_datetime)),
3) >
999999999999999
THEN
999999999999999
WHEN ROUND (
( ( ( RTTEMP.TRANS_DATETIME
- MAX (TRANS_HIST.TRANS_DATETIME))
- AVG (
TRANS_HIST.TRANS_DATETIME
- TRANS_HIST.lag_trans_datetime))
/ STDDEV (
TRANS_HIST.TRANS_DATETIME
- TRANS_HIST.lag_trans_datetime)),
3) <
-99999999999999
THEN
-99999999999999
ELSE
ROUND (
( ( ( RTTEMP.TRANS_DATETIME
- MAX (TRANS_HIST.TRANS_DATETIME))
- AVG (
TRANS_HIST.TRANS_DATETIME
- TRANS_HIST.lag_trans_datetime))
/ STDDEV (
TRANS_HIST.TRANS_DATETIME
- TRANS_HIST.lag_trans_datetime)),
3)
END AS TIME_DELTA_ZSCORE_PAST_90_DAYS
FROM (SELECT NUMERO_DE_CUENTA,
TRANS_DATETIME,
LAG (TRANS_DATETIME, 1)
OVER (
ORDER BY NUMERO_DE_CUENTA, TRANS_DATETIME) lag_trans_datetime
FROM db_fraud_bpd.tbl_event_new_transaction_h)
TRANS_HIST
RIGHT OUTER JOIN db_fraud_bpd.tbl_rt_features_temp RTTEMP
ON CAST (RTTEMP.account_number AS INTEGER) =
CAST (TRANS_HIST.NUMERO_DE_CUENTA AS INTEGER)
WHERE ( TRANS_HIST.TRANS_DATETIME < RTTEMP.TRANS_DATETIME
AND TRANS_HIST.TRANS_DATETIME >=
(RTTEMP.TRANS_DATETIME - 90))
OR TRANS_HIST.TRANS_DATETIME IS NULL
GROUP BY RTTEMP.account_number, RTTEMP.TRANS_DATETIME) TEMP
ON ( t1.TRANS_DATETIME = TEMP.TRANS_DATETIME
AND t1.ACCOUNT_NUMBER = TEMP.ACCOUNT_NUMBER)
WHEN MATCHED
THEN
UPDATE SET
t1.TIME_DELTA_ZSCORE_PAST_90_DAYS = TEMP.TIME_DELTA_ZSCORE_PAST_90_DAYS
PROCEDURE DELETE_X1
IS
v_emp_unum VARCHAR2 (25);
BEGIN
/*FOR rec
IN (SELECT l2.*,
row_number ()
OVER (PARTITION BY case_uid,nvl(min_eff_date, eff_begin_date)
ORDER BY case_uid,nvl(min_eff_date, eff_begin_date))
rw,
ROWID rid
FROM (SELECT l1.*,
MIN (
CASE
WHEN eff_end_date = next_begin - 1
THEN
eff_begin_date
END)
OVER (PARTITION BY case_uid)
min_eff_date,
MAX (
CASE
WHEN previous_end <>
TO_DATE ('12/31/9999',
'mm/dd/yyyy')
AND previous_end + 1 = eff_begin_date
THEN
eff_end_date
END)
OVER (PARTITION BY case_uid)
max_end_date
FROM (SELECT GT.*,
LEAD (
EFF_begin_DATE)
OVER (PARTITION BY CASE_UID
ORDER BY EFF_BEGIN_DATE)
next_begin,
LAG (
EFF_end_DATE)
OVER (PARTITION BY CASE_UID
ORDER BY EFF_BEGIN_DATE)
previous_end
FROM TABLE_OUTPUT GT
WHERE SRC = 'ERICSSON' AND STATUS_CODE = 'X1') l1)
l2)*/
FOR rec
IN (SELECT l2.*,
ROW_NUMBER ()
OVER (PARTITION BY case_uid, min_eff_date, max_end_date
ORDER BY case_uid, min_eff_date, max_end_date)
rw,
ROWID rid
FROM (SELECT l1.*,
(MAX (
start_at)
OVER (
PARTITION BY case_uid
ORDER BY EFF_BEGIN_DATE
ROWS UNBOUNDED PRECEDING))
min_eff_date,
(MIN (
break_at)
OVER (
PARTITION BY case_uid
ORDER BY EFF_BEGIN_DATE
ROWS BETWEEN CURRENT ROW
AND UNBOUNDED FOLLOWING))
max_end_date
FROM (SELECT GT.*,
(CASE
WHEN LAG (
EFF_end_DATE)
OVER (
PARTITION BY CASE_UID
ORDER BY EFF_BEGIN_DATE) =
EFF_BEGIN_DATE
- 1
THEN
NULL
ELSE
EFF_BEGIN_DATE
END)
start_at,
(CASE
WHEN LEAD (
EFF_BEGIN_DATE)
OVER (
PARTITION BY case_uid
ORDER BY EFF_BEGIN_DATE) =
CASE
WHEN EFF_end_DATE <>
TO_DATE (
'12/31/9999',
'mm/dd/yyyy')
THEN
EFF_end_DATE
+ 1
ELSE
EFF_end_DATE
END
THEN
NULL
ELSE
EFF_end_DATE
END)
break_at
FROM TABLE_OUTPUT GT
WHERE SRC = 'ERICSSON' AND STATUS_CODE = 'X1') l1)
l2)
Part of the code is commented out and rewritten.
Commented-out code output:
OFF TIME 1/1/2017 1/7/2017 X1
OFF TIME 1/8/2017 2/1/2017 X1
New code output
OFF TIME 1/1/2017 2/1/2017 X1
NORMAL 2/2/2017 2/2/2017 AB
OFF TIME 2/20/2017
The LAG function is used to access data from a previous row.
The LEAD function is used to return data from rows further down the result set.
MIN and MAX are aggregate functions.
I am pretty confused by the flow of the code.
I can't understand the code as a whole; please explain its logic.
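Judging from the two outputs shown above, the rewritten query appears to merge rows whose date ranges are contiguous: LAG and LEAD mark where a run of back-to-back ranges starts and ends, and the running MAX and MIN spread those boundaries over every row in the run. The following is a plain-Python sketch of that idea under my reading of the code, not a translation of the PL/SQL; `merge_contiguous` is a hypothetical name.

```python
from datetime import date, timedelta


def merge_contiguous(ranges):
    """Merge date ranges where one ends the day before the next begins.

    ranges: list of (begin, end) tuples for a single case, in any order.
    """
    merged = []
    for begin, end in sorted(ranges):
        # Same test as LAG(eff_end_date) = eff_begin_date - 1 in the query.
        if merged and merged[-1][1] == begin - timedelta(days=1):
            merged[-1] = (merged[-1][0], end)  # extend the current run
        else:
            merged.append((begin, end))  # a gap, so start a new run
    return merged


rows = [
    (date(2017, 1, 1), date(2017, 1, 7)),
    (date(2017, 1, 8), date(2017, 2, 1)),
]
result = merge_contiguous(rows)
```

With the two X1 rows from the commented-out output, `result` holds a single range from 1/1/2017 to 2/1/2017, which matches the first row of the new output.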
I am trying to update a column value. The column datatype is Number. As per the requirement, for the right records this column will be updated with 000. I have included this in the Else part of the condition but when the table is getting updated it's taking only 0 not 000. Please suggest. How can I make it 000?
MERGE INTO mem_src_extn t USING
(
SELECT mse.rowid row_id,
CASE WHEN mse.type_value IS NULL OR mse."TYPE" IS NULL OR mse.VALUE_1 IS NULL or mse.VALUE_2 IS NULL THEN 100
WHEN ( SELECT count(*) FROM cmc_mem_src cms WHERE cms.tn_id = mse.type_value ) = 0 THEN 222
WHEN count(mse.value_1) over ( partition by type_value ) > 1 THEN 333
ELSE 000 END int_value_1 -- <-- here
FROM mem_src_extn mse
) u
ON ( t.rowid = u.row_id )
WHEN MATCHED THEN UPDATE SET t.int_value_1 = u.int_value_1
If your column int_value_1 is VARCHAR2, use quotes:
MERGE INTO mem_src_extn t USING
(
SELECT mse.rowid row_id,
CASE WHEN mse.type_value IS NULL OR mse."TYPE" IS NULL OR mse.VALUE_1 IS NULL or mse.VALUE_2 IS NULL THEN '100'
WHEN ( SELECT count(*) FROM cmc_mem_src cms WHERE cms.tn_id = mse.type_value ) = 0 THEN '222'
WHEN count(mse.value_1) over ( partition by type_value ) > 1 THEN '333'
ELSE '000'
END int_value_1
FROM mem_src_extn mse
) u
ON ( t.rowid = u.row_id )
WHEN MATCHED THEN UPDATE SET t.int_value_1 = u.int_value_1
But if the column is a NUMBER, as you say, and you just want to see 000 instead of 0,
you can use to_char with a format string:
select to_char(int_value_1,'099') from mem_src_extn;
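The underlying point is that a numeric type stores a value, not the digits that were typed, so leading zeros only exist once the number is rendered as text. A tiny Python analogy (not Oracle code) shows the same distinction:

```python
value = int("000")                   # a numeric type keeps the value 0, not "000"
as_stored = str(value)               # default rendering of the number
as_displayed = format(value, "03d")  # zero-padded to width 3 at display time

print(as_stored, as_displayed)  # prints: 0 000
```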
I have identified a way to get fast paged results from the database using CTEs and the Row_Number function, as follows...
DECLARE @PageSize INT = 1
DECLARE @PageNumber INT = 2
DECLARE @Customer TABLE (
ID INT IDENTITY(1, 1),
Name VARCHAR(10),
age INT,
employed BIT)
INSERT INTO @Customer
(name,age,employed)
SELECT 'bob',21,1
UNION ALL
SELECT 'fred',33,1
UNION ALL
SELECT 'joe',29,1
UNION ALL
SELECT 'sam',16,1
UNION ALL
SELECT 'arthur',17,0;
WITH cteCustomers
AS ( SELECT
id,
Row_Number( ) OVER(ORDER BY Age DESC) AS Row
FROM @Customer
WHERE employed = 1
/*Imagine I've joined to loads more tables with a really complex where clause*/
)
SELECT
name,
age,
Total = ( SELECT
Count( id )
FROM cteCustomers )
FROM cteCustomers
INNER JOIN @Customer cust
/*This is where I choose the columns I want to read, it returns really fast!*/
ON cust.id = cteCustomers.id
WHERE row BETWEEN ( @PageSize * @PageNumber - 1 ) AND ( @PageSize * ( @PageNumber ) )
ORDER BY row ASC
Using this technique, results are returned really fast, even with complex joins and filters.
To perform paging I need to know the total number of rows returned by the full CTE. I have "bodged" this by adding a column that holds it:
Total = ( SELECT
Count( id )
FROM cteCustomers )
Is there a better way to return the total in a different result set without bodging it into a column? Because it's a CTE I can't seem to get it into a second result set.
Without using a temp table first, I'd use a CROSS JOIN to reduce the risk of row-by-row evaluation of the COUNT.
To get the total row count, this needs to happen separately from the WHERE clause:
WITH cteCustomers
AS ( SELECT
id,
Row_Number( ) OVER(ORDER BY Age DESC) AS Row
FROM @Customer
WHERE employed = 1
/*Imagine I've joined to loads more tables with a really complex where clause*/
)
SELECT
name,
age,
Total
FROM cteCustomers
INNER JOIN @Customer cust
/*This is where I choose the columns I want to read, it returns really fast!*/
ON cust.id = cteCustomers.id
CROSS JOIN
(SELECT Count( *) AS Total FROM cteCustomers ) foo
WHERE row BETWEEN ( @PageSize * @PageNumber - 1 ) AND ( @PageSize * ( @PageNumber ) )
ORDER BY row ASC
However, this isn't guaranteed to give accurate results as demonstrated here:
can I get count() and rows from one sql query in sql server?
Edit: after a few comments.
How to avoid a CROSS JOIN
WITH cteCustomers
AS ( SELECT
id,
Row_Number( ) OVER(ORDER BY Age DESC) AS Row,
COUNT(*) OVER () AS Total --the magic for this edit
FROM @Customer
WHERE employed = 1
/*Imagine I've joined to loads more tables with a really complex where clause*/
)
SELECT
name,
age,
Total
FROM cteCustomers
INNER JOIN @Customer cust
/*This is where I choose the columns I want to read, it returns really fast!*/
ON cust.id = cteCustomers.id
WHERE row BETWEEN ( @PageSize * @PageNumber - 1 ) AND ( @PageSize * ( @PageNumber ) )
ORDER BY row ASC
Note: performance may vary depending on SQL Server 2005 vs. 2008, service pack, etc.
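For anyone who wants to try the `COUNT(*) OVER ()` idea without a SQL Server instance, here is a minimal sketch using Python's `sqlite3` module with the sample data from the question. It assumes the bundled SQLite is 3.25 or newer (needed for window functions); the table name `customer` and the function `fetch_page` are my own. It also computes the page bounds as `page_size * (page_number - 1) + 1` through `page_size * page_number`, which differs from the lower bound used in the thread: with a page size of 1 and page number 2, the thread's expression evaluates to BETWEEN 1 AND 2 and so spans two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, age INT, employed INT)"
)
conn.executemany(
    "INSERT INTO customer (name, age, employed) VALUES (?, ?, ?)",
    [("bob", 21, 1), ("fred", 33, 1), ("joe", 29, 1), ("sam", 16, 1), ("arthur", 17, 0)],
)


def fetch_page(page_size, page_number):
    """Return (rows, total) for one page of employed customers, oldest first."""
    first = page_size * (page_number - 1) + 1  # first row number on the page
    last = page_size * page_number             # last row number on the page
    rows = conn.execute(
        """
        WITH cte AS (
            SELECT id,
                   ROW_NUMBER() OVER (ORDER BY age DESC) AS rn,
                   COUNT(*) OVER () AS total  -- size of the whole filtered set
            FROM customer
            WHERE employed = 1
        )
        SELECT c.name, c.age, cte.total
        FROM cte
        JOIN customer AS c ON c.id = cte.id
        WHERE cte.rn BETWEEN ? AND ?
        ORDER BY cte.rn
        """,
        (first, last),
    ).fetchall()
    total = rows[0][2] if rows else 0
    return [(name, age) for name, age, _ in rows], total
```

One limitation of carrying the total as a column: a page past the end returns no rows, so there is nothing to read the total from, and this sketch reports 0 in that case.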
Edit 2:
SQL Server Central shows another technique that uses a reversed ROW_NUMBER. It looks useful.
@Digiguru
OMG, this really is the holy grail!
WITH cteCustomers
AS ( SELECT id,
Row_Number() OVER(ORDER BY Age DESC) AS Row,
Row_Number() OVER(ORDER BY id ASC)
+ Row_Number() OVER(ORDER BY id DESC) - 1 AS Total /*<- voodoo here*/
FROM @Customer
WHERE employed = 1
/*Imagine I've joined to loads more tables with a really complex where clause*/
)
SELECT name, age, Total
/*This is where I choose the columns I want to read, it returns really fast!*/
FROM cteCustomers
INNER JOIN @Customer cust
ON cust.id = cteCustomers.id
WHERE row BETWEEN ( @PageSize * @PageNumber - 1 ) AND ( @PageSize * ( @PageNumber ) )
ORDER BY row ASC
So obvious now.
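Why the sum works: if a row is at position i of n when sorted ascending by a unique key, it is at position n - i + 1 when sorted descending, so the two positions add up to n + 1 and subtracting 1 leaves n. A short Python check of that arithmetic (an illustration only, with made-up ids):

```python
ids = [7, 3, 9, 4]  # any set of distinct ids

# Position of each id in ascending and in descending order, starting at 1.
asc_rank = {v: i for i, v in enumerate(sorted(ids), start=1)}
desc_rank = {v: i for i, v in enumerate(sorted(ids, reverse=True), start=1)}

totals = [asc_rank[v] + desc_rank[v] - 1 for v in ids]
print(totals)  # prints: [4, 4, 4, 4]
```

The identity relies on the ordering key being unique, as `id` is here. With ties, the two ROW_NUMBER orderings are not guaranteed to be exact mirrors of each other.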