Tune oracle query with groupby clause - oracle

I have a table with Lots of cost columns for each Key
TableA
SK1 SK2 Col1 Col2 Col3..... Col50 Flg(Y/N)
1 2 10 20 30 ...... 500 Y
1 2 10 20 30 ...... 500 N
2 2 10 20 30 ...... 500 N
I need to aggregate(sum) of all values and then check if there are any values with Y then add them to new tableB.
Here table A record combination (1,2) for (sk1,sk2) should be returned.
The i have written query is to select lisr of all cols and add as group by.
We have lots of data so this query is taking too long to run. Any chance to relook into this and do so that it can become faster.
select
Sk1,
Sk2,
nvl(sum(col3),0),
nvl(sum(col4))0,
.....
nvl(sum(col50))
from table A
group by Sk1,
Sk2
Iam using this as part of large query where in many other calculations are performed on top of this.

Working out whether any of a grouped set of records contains a 'Y' would be as simple as ...
select ...
from ...
group by ...
having max(flg) = 'Y'

For now i have created a temporary table and have loaded all the data into it.

If you are using this as part of large query, did you try WITH option?
It could be like this
WITH SUM_DATA AS (select col1, col2, nvl(sum(col3),0), nvl(sum(col4))0, ..... nvl(sum(col50)) from table A group by col1, col2)
SELECT xyz
FROM abc, sum_data
WHERE abc.join_col = sum_data.join_col
More help here

Related

How to write correct left Join of two tables?

I want to join two tables, first table primary key data type is number, and second table primary key data type is VARCHAR2(30 BYTE). How to join both tables.
I tried this code but second tables all values are null. why is that?
SELECT a.act_phone_no,a.act_actdevice,a.bi_account_id, a.packag_start_date, c.identification_number,
FROM ACTIVATIONS_POP a
left JOIN customer c
on TO_CHAR(a.act_phone_no) = c.msisdn_voice
first table
act_phone_no bi_account_id
23434 45345
34245 43556
Second table
msisdn_voice identification_number
23434 321113
34245 6547657
It seems that you didn't tell us everything. Query works, if correctly written, on such a sample data:
SQL> with
2 -- Sample data
3 activations_pop (act_phone_no, bi_account_id) as
4 (select 23434, 45345 from dual union all
5 select 34245, 43556 from dual
6 ),
7 customer (msisdn_voice, identification_number) as
8 (select '23434', 321113 from dual union all
9 select '34245', 6547657 from dual
10 )
11 -- query works OK
12 select a.act_phone_no,
13 a.bi_account_id,
14 c.identification_number
15 from activations_pop a join customer c on to_char(a.act_phone_no) = c.msisdn_voice;
ACT_PHONE_NO BI_ACCOUNT_ID IDENTIFICATION_NUMBER
------------ ------------- ---------------------
23434 45345 321113
34245 43556 6547657
SQL>
What could be wrong? Who knows. If you got some result but columns from the CUSTOMER table are empty (NULL?), then they really might be NULL, or you didn't manage to join rows on those columns (left/right padding with spaces?). Does joining on e.g.
on to_char(a.act_phone_no) = trim(c.msisdn_voice)
or
on a.act_phone_no = to_number(c.msisdn_voice)
help?
Consider posting proper test case (CREATE TABLE and INSERT INTO statements).
You are using Oracle ?
Please check the below demo
SELECT a.act_phone_no, a.bi_account_id, c.identification_number
FROM ACTIVATIONS_POP a
left JOIN customer c
on TO_CHAR(a.act_phone_no) = c.msisdn_voice;
SQLFiddle

delete unique rows from table having minimum value using joins

select a.rowid,a.ta_transaction_at_k,a.ta_approvalid_k
from OF_TATRANSACTIONAPPROVALS a,OF_ATAVAILMENTTICKETS b
where a.TA_TRANSACTION_AT_K=b.at_transaction_k
and a.TA_APPROVALROLE_RO = 98
and a.TA_APPROVALTYPE = 'TA'
and b.AT_GROUP_ID=402
I have this query which gives result in below format. How can I delete records from it
331789 3
331789 4
331789 5
331787 3
331787 4
331787 5
I want to delete ids with minimum values
got the issue resolved using row ids.
delete from OF_TATRANSACTIONAPPROVALS where rowid in(
select rowsid from(
select a.rowid rowsid,a.ta_transaction_at_k trns,a.ta_approvalid_k,row_number() over (partition by TA_TRANSACTION_AT_K order by TA_TRANSACTION_AT_K,ta_approvalid_k) rn
from OF_TATRANSACTIONAPPROVALS a,OF_ATAVAILMENTTICKETS b where a.TA_TRANSACTION_AT_K=b.at_transaction_k
and a.TA_APPROVALROLE_RO = 98 and a.TA_APPROVALTYPE = 'TA' and b.AT_GROUP_ID=402) where rn=1)

How to set range for limit clause in hive

How to set range for limit clause in hive , I have tried the below query but failed with syntax error . Can someone please help
select * from table limit 1000,2000;
You can use Row_Number window function and set the range limit.
Below Query will result only the first 20 records from the table
hive> select * from
(
SELECT *,ROW_NUMBER() over (Order by id) as rowid FROM <tab_name>
)t
where rowid > 0 and rowid <=20;
Using Between operator to specify range
hive> select * from
(
SELECT *,ROW_NUMBER() over (Order by id) as rowid FROM <tab_name>
)t
where rowid between 0 and 20;
To fetch rows from 20 to 40 then increase the lower/upper bound values
hive> select * from
(
SELECT *,ROW_NUMBER() over (Order by id) as rowid FROM <tab_name>
)t
where rowid > 20 and rowid <=40;
The LIMIT clause is used to set a ceiling on the number of rows in the result set. You are getting a syntax error because of an incorrect usage of this HQL clause.
The query could be written as the following to return no more than 2000 rows:
SELECT * FROM table LIMIT 2000;
You could also write it like so to return no more than 1000 rows:
SELECT * FROM table LIMIT 1000;
However you cannot combine both into the same argument for LIMIT. The LIMIT argument must evaluate to a constant value.
I will try and expand on this information a bit to try and help solve your problem. If you are attempting to "paginate" your results the following may be of use.
FIRST I would recommend against leaning on HQL for pagination, in most situations that would be more efficiently implemented on the application logic side (query large result set, cache what you need, paginate with application logic). If you have no choice but to pull out ranges of rows you can get the desired effect through a combination of the LIMIT, ORDER BY, and OFFSET clauses.
LIMIT : This will limit your result set to a maximum number of row
ORDER BY: This will sort/order your result set based on one or more columns
OFFSET: This will start your result set at a certain row after the logical first entry in the table.
You may combine these three clauses to effectively query "pages" of your table. For example the following three queries show how to get the first 3 blocks of data from a table where each block contains 1000 rows and the target table's 'column1' is used to determine logical order.
SELECT title as "Page 1", column1, column2, ... FROM table
ORDER BY column1 LIMIT 1000 OFFSET 0;
SELECT title as "Page 2", column1, column2, ... FROM table
ORDER BY column1 LIMIT 1000 OFFSET 1000;
SELECT title as "Page 3", column1, column2, ... FROM table
ORDER BY column1 LIMIT 1000 OFFSET 2000;
Each query declares 'column1' as the sorting value with ORDER BY. The queries will return no more than 1000 rows due to the LIMIT clause. Each result set will start at a different row due to the OFFSET being incremented by the "page size" for each query.
I am not sure what you are trying to achieve, but ...
That will return the 1001 and the 2001 record in the query results set only if you are using hive a hive version greater than 2.0.0
hive --version
(https://issues.apache.org/jira/browse/HIVE-11531)
Limit in Hive gives 'n' number of records randomly. It's not to print a range of records.
You may use order by in conjunction with limit to get what you want

update rows from multiple tables

I have two tables affiliation and customer, in that i have data like this
aff_id From_cus_id
------ -----------
1 10
2 20
3 30
4 40
5 50
cust_id cust_aff_id
------- -------
10
20
30
40
50
i need to update data for cust_aff_id column from affiliation table which is aff_id like below
cust_id cust_aff_id
------- -------
10 1
20 2
30 3
40 4
50 5
could u please give reply if anyone knows......
Oracle doesn't have an UPDATE with join syntax, but you can use a subquery instead:
UPDATE customer
SET customer.cust_aff_id =
(SELECT aff_id FROM affiliation WHERE From_cus_id = customer.cust_id)
merge into customer t2
using affiliation t1 on (t1.From_cus_id =t2.cust_id )
WHEN MATCHED THEN
update set t2.cust_aff_id = t1.aff_id
;
Here is an update with join syntax. This, quite reasonably, works only if from_cus_id is primary key in the first table and cust_id is foreign key in the second table, referencing the first table. Without these conditions, the requirement doesn't make much sense in the first place anyway... but Oracle requires that these constraints be stated explicitly in the tables. This is also reasonable on Oracle's part IMO.
update
( select t1.aff_id, t2.cust_aff_id
from affiliation t1 join customer t2 on t2.cust_id = t1.from_cus_id) j
set j.cust_aff_id = j.aff_id;

How to get records randomly from the oracle database?

I need to select rows randomly from an Oracle DB.
Ex: Assume a table with 100 rows, how I can randomly return 20 of those records from the entire 100 rows.
SELECT *
FROM (
SELECT *
FROM table
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum < 21;
SAMPLE() is not guaranteed to give you exactly 20 rows, but might be suitable (and may perform significantly better than a full query + sort-by-random for large tables):
SELECT *
FROM table SAMPLE(20);
Note: the 20 here is an approximate percentage, not the number of rows desired. In this case, since you have 100 rows, to get approximately 20 rows you ask for a 20% sample.
SELECT * FROM table SAMPLE(10) WHERE ROWNUM <= 20;
This is more efficient as it doesn't need to sort the Table.
SELECT column FROM
( SELECT column, dbms_random.value FROM table ORDER BY 2 )
where rownum <= 20;
In summary, two ways were introduced
1) using order by DBMS_RANDOM.VALUE clause
2) using sample([%]) function
The first way has advantage in 'CORRECTNESS' which means you will never fail get result if it actually exists, while in the second way you may get no result even though it has cases satisfying the query condition since information is reduced during sampling.
The second way has advantage in 'EFFICIENT' which mean you will get result faster and give light load to your database.
I was given an warning from DBA that my query using the first way gives loads to the database
You can choose one of two ways according to your interest!
In case of huge tables standard way with sorting by dbms_random.value is not effective because you need to scan whole table and dbms_random.value is pretty slow function and requires context switches. For such cases, there are 3 additional methods:
1: Use sample clause:
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
for example:
select *
from s1 sample block(1)
order by dbms_random.value
fetch first 1 rows only
ie get 1% of all blocks, then sort them randomly and return just 1 row.
2: if you have an index/primary key on the column with normal distribution, you can get min and max values, get random value in this range and get first row with a value greater or equal than that randomly generated value.
Example:
--big table with 1 mln rows with primary key on ID with normal distribution:
Create table s1(id primary key,padding) as
select level, rpad('x',100,'x')
from dual
connect by level<=1e6;
select *
from s1
where id>=(select
dbms_random.value(
(select min(id) from s1),
(select max(id) from s1)
)
from dual)
order by id
fetch first 1 rows only;
3: get random table block, generate rowid and get row from the table by this rowid:
select *
from s1
where rowid = (
select
DBMS_ROWID.ROWID_CREATE (
1,
objd,
file#,
block#,
1)
from
(
select/*+ rule */ file#,block#,objd
from v$bh b
where b.objd in (select o.data_object_id from user_objects o where object_name='S1' /* table_name */)
order by dbms_random.value
fetch first 1 rows only
)
);
To randomly select 20 rows I think you'd be better off selecting the lot of them randomly ordered and selecting the first 20 of that set.
Something like:
Select *
from (select *
from table
order by dbms_random.value) -- you can also use DBMS_RANDOM.RANDOM
where rownum < 21;
Best used for small tables to avoid selecting large chunks of data only to discard most of it.
Here's how to pick a random sample out of each group:
SELECT GROUPING_COLUMN,
MIN (COLUMN_NAME) KEEP (DENSE_RANK FIRST ORDER BY DBMS_RANDOM.VALUE)
AS RANDOM_SAMPLE
FROM TABLE_NAME
GROUP BY GROUPING_COLUMN
ORDER BY GROUPING_COLUMN;
I'm not sure how efficient it is, but if you have a lot of categories and sub-categories, this seems to do the job nicely.
-- Q. How to find Random 50% records from table ?
when we want percent wise randomly data
SELECT *
FROM (
SELECT *
FROM table_name
ORDER BY DBMS_RANDOM.RANDOM)
WHERE rownum <= (select count(*) from table_name) * 50/100;

Resources