Oracle SQL query with CASE WHEN EXISTS subquery optimization - oracle

I'm using the following query to create a view in Oracle 11g (11.2.0.3.0).
CREATE OR REPLACE FORCE VIEW V_DOCUMENTS_LIST
(
ID_DOC,
ATTACHMENTS_COUNT,
TOTAL_DIMENSION,
INSERT_DATE,
ID_STATE,
STATE,
ID_INSTITUTE,
INSTITUTE,
HASJOB
)
AS
SELECT D.ID_DOC,
COUNT (F.ID_FILE) AS ATTACHMENTS_COUNT,
CASE
WHEN SUM (F.DIMENSION) IS NULL THEN 0
ELSE SUM (F.DIMENSION)
END
AS TOTAL_DIMENSION,
D.INSERT_DATE,
D.ID_STATE,
S.STATE_DESC AS STATE,
D.ID_INSTITUTE,
E.NAME AS INSTITUTE,
CASE
WHEN EXISTS (SELECT D.ID_DOC FROM JOB) THEN 'true'
ELSE 'false'
END
AS HASJOB
FROM DOCUMENTS D
LEFT JOIN FILES F ON D.ID_DOC = F.ID_DOC
JOIN STATES S ON D.ID_STATE = S.ID_STATE
JOIN INSTITUTES E ON D.ID_INSTITUTE = E.ID_INSTITUTE
GROUP BY D.ID_DOC,
D.INSERT_DATE,
D.ID_STATE,
S.STATE_DESC,
D.ID_INSTITUTE,
E.NAME;
Then I query that view to get the values for a DataGridView in an ASPX page.
SELECT *
FROM V_DOCUMENTS_LIST
ORDER BY ID_STATE DESC, INSTITUTE, INSERT_DATE DESC;
Relevant tables and relations
DOCUMENTS; FILES; JOBS;
DOCUMENTS (1-1) <----> (0-N) FILES
JOBS (0-1) <----> (0-N) DOCUMENTS
Querying the view I get the complete list of documents with all their associated information (ID, description, dates, state, etc.) and also for each one:
total count of attached files;
total dimension in bytes of attached files;
boolean value indicating whether there's at least one JOB associated to the DOCUMENT or not.
Everything worked fine while the view contained only a few thousand records. Now the number of records is growing, and the SELECT * FROM on the view takes about 2:30 minutes with 15,000-20,000 records.
I know that a really time consuming part of my view is the nested SELECT:
CASE
WHEN EXISTS (SELECT D.ID_DOC FROM JOB) THEN 'true'
ELSE 'false'
END
AS HASJOB
How can I optimize my view?

To address the EXISTS issue, you can add a join:
LEFT JOIN (select distinct id_doc from JOB) J
ON d.id_doc = J.id_doc
The HASJOB column would then be:
CASE
WHEN j.id_doc is not null THEN 'true'
ELSE 'false'
END AS HASJOB
PS: Your current implementation has a problem: SELECT D.ID_DOC FROM JOB always returns rows if the JOB table has any rows. It is equivalent to select * from job, because EXISTS only tests for the existence of rows. A logically correct implementation would be: SELECT 1 FROM JOB j where j.id_doc = D.ID_DOC.
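The difference between the uncorrelated and the correlated EXISTS can be seen on a tiny data set. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Oracle, the table and column names follow the question, and the sample rows are made up.

```python
import sqlite3

# SQLite is used as a stand-in for Oracle; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id_doc INTEGER PRIMARY KEY);
    CREATE TABLE job (id_job INTEGER PRIMARY KEY, id_doc INTEGER);
    INSERT INTO documents VALUES (1), (2), (3);
    INSERT INTO job VALUES (10, 1);   -- only document 1 has a job
""")

# Original form: the subquery has no WHERE clause, so it returns a row
# whenever JOB is non-empty, whatever the current document is.
uncorrelated = conn.execute("""
    SELECT d.id_doc,
           CASE WHEN EXISTS (SELECT d.id_doc FROM job)
                THEN 'true' ELSE 'false' END AS hasjob
    FROM documents d
    ORDER BY d.id_doc
""").fetchall()

# Corrected form: the subquery is tied to the current document.
correlated = conn.execute("""
    SELECT d.id_doc,
           CASE WHEN EXISTS (SELECT 1 FROM job j WHERE j.id_doc = d.id_doc)
                THEN 'true' ELSE 'false' END AS hasjob
    FROM documents d
    ORDER BY d.id_doc
""").fetchall()

print(uncorrelated)
print(correlated)
```

With this data the first query flags every document as having a job, while the second flags only document 1.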

You are doing a full scan on table JOB. Put a WHERE clause in this subquery:
SELECT D.ID_DOC FROM JOB

Related

Query to get Unique Indexes having NOT NULL columns - Oracle

I am trying to find all the unique indexes defined on a table whose columns are all NOT NULL, in an Oracle database. The reason is that Oracle allows creating unique indexes on columns that are defined as nullable.
So if my table has two unique indexes, I want to retrieve only the unique index whose columns all have NOT NULL constraints.
I did come up with this query:
select ind.index_name, ind_col.column_name, ind.index_type, ind.uniqueness
from sys.dba_indexes ind
inner join sys.dba_ind_columns ind_col on ind.owner = ind_col.index_owner and ind.index_name = ind_col.index_name
where ind.owner in ('ISADRM') and ind.table_name in ('TH_RHELOR') and ind.uniqueness IN ('UNIQUE')
The above query gives me all the unique indexes with their associated columns, but I am not sure how to join it with ALL_TAB_COLS, which has the nullability data for all the columns of a table.
I tried joining this table with the indexes and tried a subquery as well, but I am not getting the right results.
Any comments on this would be appreciated.
Analytic functions and inline views can help.
The analytic functions let you return detailed data but also create a summary on that data, based on separate windows. The detailed results include index owner, index name, and column name, but the counts are only per index owner and index name.
The inline view joins the three tables, returns the detailed information, and uses analytic functions to generate the count of all columns and the count of all NOT NULL columns. The outer query then selects only the rows where those two counts are equal.
--Unique indexes and columns where every column is NOT NULL.
select owner, index_name, column_name
from
(
--All relevant columns and counts of columns and not null columns.
select
dba_indexes.owner,
dba_indexes.index_name,
dba_tab_columns.column_name,
dba_tab_columns.nullable,
count(*) over (partition by dba_indexes.owner, dba_indexes.index_name) total_columns,
sum(case when nullable = 'N' then 1 else 0 end)
over (partition by dba_indexes.owner, dba_indexes.index_name) total_not_null_columns
from dba_indexes
join dba_ind_columns
on dba_indexes.owner = dba_ind_columns.index_owner
and dba_indexes.index_name = dba_ind_columns.index_name
join dba_tab_columns
on dba_ind_columns.table_owner = dba_tab_columns.owner
and dba_ind_columns.table_name = dba_tab_columns.table_name
and dba_ind_columns.column_name = dba_tab_columns.column_name
where dba_indexes.owner = user
and dba_indexes.uniqueness = 'UNIQUE'
order by 1,2,3
)
where total_columns = total_not_null_columns
order by 1,2,3;
Analytic functions and inline views are tricky but they're very powerful once you learn how to use them.
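To see the two window counts at work without a full data dictionary, here is a reduced sketch. It assumes Python's sqlite3 module with an SQLite build that supports window functions (3.25 or later), and a made-up index_columns table that stands in for the joined dictionary views.

```python
import sqlite3

# index_columns is a hypothetical, flattened stand-in for the join of
# dba_indexes, dba_ind_columns and dba_tab_columns; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE index_columns (index_name TEXT, column_name TEXT, nullable TEXT);
    INSERT INTO index_columns VALUES
        ('idx_a', 'id',   'N'),
        ('idx_a', 'code', 'N'),
        ('idx_b', 'id',   'N'),
        ('idx_b', 'note', 'Y');   -- idx_b contains a nullable column
""")

# Both counts are computed per index but attached to every detail row,
# so the outer query can filter on them without losing the column names.
all_not_null = conn.execute("""
    SELECT index_name, column_name
    FROM (
        SELECT index_name,
               column_name,
               COUNT(*) OVER (PARTITION BY index_name) AS total_columns,
               SUM(CASE WHEN nullable = 'N' THEN 1 ELSE 0 END)
                   OVER (PARTITION BY index_name) AS total_not_null_columns
        FROM index_columns
    )
    WHERE total_columns = total_not_null_columns
    ORDER BY index_name, column_name
""").fetchall()

print(all_not_null)
```

Only the columns of idx_a come back, because idx_b has two columns but only one of them is NOT NULL.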

How to check if a set of values exist in item table in Oracle

I have two tables: 'Order' and 'Order Item'.
Order table contains-
Order Number, Order Date, etc.
Order Item table contains-
Order Number, Order Item Number, Product Name, etc.
The joining condition between these two tables is on Order Number.
In my target table I need orders and a flag. The flag should be set to 'Yes' if a predefined set of products has been ordered as part of that order.
E.g., suppose order 'ORD-01' contains three products in the Order Item table ('Mobile', 'PC' and 'Tablet'); then my resulting table should contain Order Number 'ORD-01' and Flag 'Yes'.
In the same way, if order 'ORD-02' contains only two products, 'Mobile' and 'Tablet', then the resulting table should contain 'ORD-02' and Flag 'No'.
Similarly, if order 'ORD-03' contains three different products, 'Notebook', 'PC' and 'Tablet', then the resulting table should contain 'ORD-03' and Flag 'No'.
As per my understanding, I have written below query-
SELECT order_number,(SELECT CASE WHEN COUNT(DISTINCT product_name)>=3
THEN 'Yes' ELSE 'No' END Prod_Flag
FROM order_item b
WHERE a.order_number=b.order_number
AND b.product_name IN ('Mobile','PC','Tablet'))
FROM order a
WHERE order_date>last_run_date;
But it takes too much time, as Order Item is a very big table (>1 billion rows). I only need incremental data, based on the order date from the Order table. Even though there is an index on order number in both tables, it is slow.
Would a query like this get you to your result any quicker?
SELECT O.ORDER_NUMBER,
CASE WHEN SET_FOUND.ORDER_NUMBER IS NOT NULL
THEN 'Yes' ELSE 'No' END PROD_FLAG
FROM ORDER O,
(SELECT ORDER_NUMBER
FROM ORDER_ITEM
WHERE PRODUCT_NAME = 'Mobile'
INTERSECT
SELECT ORDER_NUMBER
FROM ORDER_ITEM
WHERE PRODUCT_NAME = 'PC'
INTERSECT
SELECT ORDER_NUMBER
FROM ORDER_ITEM
WHERE PRODUCT_NAME = 'Tablet') SET_FOUND
WHERE O.ORDER_NUMBER = SET_FOUND.ORDER_NUMBER (+)
My proposal would be this one:
WITH t AS
(SELECT product_name, order_number
FROM order_item
WHERE product_name IN ('Mobile','PC','Tablet')
GROUP BY order_number, product_name)
SELECT order_number,
CASE WHEN COUNT(DISTINCT product_name) >= 3 THEN 'Yes' ELSE 'No' END
FROM t
JOIN order USING (order_number)
GROUP BY order_number
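The grouping idea can be checked against the three example orders from the question. This is a sketch only: it runs on Python's sqlite3 rather than Oracle, the table is called orders because ORDER is a reserved word, and it uses a LEFT JOIN (an assumption on my part) so that orders with none of the three products still appear with 'No'.

```python
import sqlite3

# SQLite stand-in for Oracle; rows are the ORD-01..ORD-03 examples
# described in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_number TEXT PRIMARY KEY);
    CREATE TABLE order_item (order_number TEXT, product_name TEXT);
    INSERT INTO orders VALUES ('ORD-01'), ('ORD-02'), ('ORD-03');
    INSERT INTO order_item VALUES
        ('ORD-01', 'Mobile'), ('ORD-01', 'PC'), ('ORD-01', 'Tablet'),
        ('ORD-02', 'Mobile'), ('ORD-02', 'Tablet'),
        ('ORD-03', 'Notebook'), ('ORD-03', 'PC'), ('ORD-03', 'Tablet');
""")

# Count how many of the three wanted products each order contains.
# The product filter lives in the ON clause so the LEFT JOIN keeps
# orders that contain none of them.
flags = conn.execute("""
    SELECT o.order_number,
           CASE WHEN COUNT(DISTINCT i.product_name) = 3
                THEN 'Yes' ELSE 'No' END AS prod_flag
    FROM orders o
    LEFT JOIN order_item i
           ON i.order_number = o.order_number
          AND i.product_name IN ('Mobile', 'PC', 'Tablet')
    GROUP BY o.order_number
    ORDER BY o.order_number
""").fetchall()

print(flags)
```

ORD-01 gets 'Yes'; ORD-02 and ORD-03 each contain only two of the three products and get 'No'.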
Is the order number an increasing sequence number? If so, one approach would be to limit the data selected from order_item, which you said is a large table, by putting a condition on order_number, which you said is an indexed column. I assume last_run_date significantly limits the number of orders concerned.
If so you can:
select min(order_number) into order_num_from from Order where order_date>last_run_date
and then make your query
SELECT order_number,(SELECT CASE WHEN COUNT(DISTINCT product_name)>=3
THEN 'Yes' ELSE 'No' END Prod_Flag
FROM order_item b
WHERE a.order_number=b.order_number
AND b.order_number >= order_num_from
AND b.product_name IN ('Mobile','PC','Tablet'))
FROM order a
WHERE order_date>last_run_date;
If this runs significantly faster (I didn't see explain plan, so this is just an idea how to avoid full table scan ), put an index on order_date column and eventually make finding order_num_from into subquery to have one single query.
Generally, your query is right. As I understand it, you want to improve its speed. If so, there are several things you can try.
You can consider putting these tables into an indexed cluster. It stores the data physically joined, so querying requires fewer physical reads.
For this query, the server has to scan two tables: one for the appropriate dates (either a full table scan or an index scan), the other for the products, and it joins the results by reading ORDER_NUMBER via rowid. That isn't very fast anyway. The simplest way is to add an (ORDER_DATE, ORDER_NUMBER) index on ORDER and an (ORDER_NUMBER, PRODUCT_NAME) index on ORDER_ITEM; this allows the query to be answered from the indexes alone.
Maybe it would be suitable to make a fast-refreshable materialized view, something like
create materialized view mv_order_products as -- mv_order_products is a placeholder name
select
a.order_date,
a.order_number,
sum(case when b.product_name = 'Mobile' then 1 else 0 end) cnt_mobiles,
sum(case when b.product_name = 'PC' then 1 else 0 end) cnt_pcs,
sum(case when b.product_name = 'Tablet' then 1 else 0 end) cnt_tablets
from
order a, order_item b
where
a.order_number = b.order_number
group by
a.order_number, a.order_date
If it turns out to be impossible to make this fast-refreshable, you can do the same thing manually using a trigger. Either way, you get precalculated data ready to check.

Need to select column from subquery into main query

I have a query like below - table names etc. changed for keeping the actual data private
SELECT inv.*,TRUNC(sysdate)
FROM Invoice inv
WHERE (inv.carrier,inv.pro,inv.ndate) IN
(
SELECT carrier,pro,n_dt FROM Order where TRUNC(Order.cr_dt) = TRUNC(sysdate)
)
I am selecting records from Invoice based on Order, i.e. all records from Invoice that match today's Order records on those 3 columns.
Now I want to select Order_Num from Order in my select query as well, so that I can insert the whole thing into a totally separate table, let's say orderedInvoices.
insert into orderedInvoices(seq_no,..same columns as Inv...,Cr_dt)
(
SELECT **Order.Order_Num**, inv.*,TRUNC(sysdate)
FROM Invoice inv
WHERE (inv.carrier,inv.pro,inv.ndate) IN
(
SELECT carrier,pro,n_dt FROM Order where TRUNC(Order.cr_dt) = TRUNC(sysdate)
)
)
How do I select that Order_Num in the main query for each record of that subquery?
p.s. I understand that trunc(cr_dt) will not use index on cr_dt (if a index is there..) but I couldn't select records unless I omit the time part of it..:(
If the table ORDER [1] is unique on CARRIER, PRO and N_DT, you can use a JOIN instead of IN to restrict your records; it also lets you select whatever data you want from either table:
select ord.order_num, inv.*, trunc(sysdate)
from Invoice inv
join order ord
on inv.carrier = ord.carrier
and inv.pro = ord.pro
and inv.ndate = ord.n_dt
where trunc(ord.cr_dt) = trunc(sysdate)
If it's not unique then you have to use DISTINCT to deduplicate your record set.
Although a predicate on TRUNC(CR_DT) cannot use a plain index on that column, you can create a function-based index on the expression if you do need one.
create index i_order_trunc_cr_dt on order (trunc(cr_dt));
1. This is a really bad name for a table as it's a keyword, consider using ORDERS instead.

NOT IN query... odd results

I need a list of users in one database that are not listed as the new_user_id in another. There are 112,815 matching users in both databases; user_id is the key in all queried tables.
Query #1 works, and gives me 111,327 users who are NOT referenced as a new_user_Id. But it requires querying the same data twice.
-- 111,327 GSU users are NOT listed as a CSS new user
-- 1,488 GSU users ARE listed as a new user in CSS
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id not in (select cud.new_user_id
from css.user_desc cud
where cud.new_user_id is not null);
Query #2 would be perfect... and I'm actually surprised that it's syntactically accepted. But it gives me a result that makes no sense.
-- This gives me 1,505 users... I've checked, and they are not
-- referenced as new_user_ids in CSS, but I don't know why the ones
-- that were excluded were excluded.
--
-- Where are the missing 109,822, and what excluded them?
--
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id not in (cudsubq.new_user_id);
What exactly is the where clause in the second query doing, and why is it excluding 109,822 records from the results?
Note: The above query is a simplification of what I'm really after. There are other/better ways to do the above queries; they're just representative of the part of the query that's giving me problems.
Read this: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:442029737684
From what I understand, your cudsubq.new_user_id can be NULL even though both tables are joined by user_id, and the NOT IN operator returns no rows when the set it is compared against contains NULL values. Consider the example in the article:
select * from dual where dummy not in ( NULL )
This returns no records. Try using the NOT EXISTS operator or just another kind of join. Here is a good source: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
And what you need is the fourth example:
SELECT COUNT(descr.user_id)
FROM
user_profile prof
LEFT OUTER JOIN user_desc descr
ON prof.user_id = descr.user_id
WHERE descr.new_user_id IS NULL
OR descr.new_user_id != prof.user_id
Second query is semantically different. In this case
where gup.user_id not in (cudsubq.new_user_id)
cudsubq.new_user_id is treated as expression (doc: IN condition), not as a subquery, thus the whole clause is basically equivalent to
where gup.user_id != cudsubq.new_user_id
So, in your first query, you're literally asking "show me all users in GUP, who also have entries in CSS and their GUP.ID is not matching ANY NOT NULL NEW_ID in CSS ".
However, the second query is "show me all users in GUP, who also have entries in CSS and their GUP.ID is not equal to their RESPECTIVE NULLABLE (no is not null clause, remember?) CSS.NEW_ID value".
And any (not) in (or equality/inequality) checks with nulls don't actually work.
12:07:54 SYSTEM#oars_sandbox> select * from dual where 1 not in (null, 2, 3, 4);
no rows selected
Elapsed: 00:00:00.00
This is where you lose your rows. I would probably rewrite your second query's where clause as
where cudsubq.new_user_id is null, assuming that non-matching users have null new_user_id.
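The NULL behaviour is easy to reproduce in isolation. The sketch below uses Python's sqlite3 as a stand-in; it has no DUAL table, so a bare SELECT with a WHERE clause plays that role. In both engines a comparison with NULL evaluates to unknown rather than true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1 NOT IN (NULL, 2, 3, 4) expands to 1 != NULL AND 1 != 2 AND ...
# The first term is unknown, so the whole condition is never true.
not_in_with_null = conn.execute(
    "SELECT 'row' WHERE 1 NOT IN (NULL, 2, 3, 4)"
).fetchall()

# Without the NULL the same condition is simply true.
not_in_without_null = conn.execute(
    "SELECT 'row' WHERE 1 NOT IN (2, 3, 4)"
).fetchall()

# A single-expression NOT IN behaves like !=, which is also
# unknown when the other side is NULL.
not_equal_null = conn.execute(
    "SELECT 'row' WHERE 1 != NULL"
).fetchall()

print(not_in_with_null, not_in_without_null, not_equal_null)
```

Only the middle query returns a row; the other two return nothing, which is the same effect that drops the rows with a NULL new_user_id in the second query.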
Your second select compares gup.user_id with cud.new_user_id on current joining record. You can rewrite the query to get the same result
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id, cud.new_user_id, cud.user_type_code
from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id
where gup.user_id != cudsubq.new_user_id or cudsubq.new_user_id is null;
You mentioned that you are comparing a list of users in one database with a list of users in another. So you do need to query twice, and you are not querying the same data. Maybe you can use the "minus" operator to avoid using "in":
select count(gup.user_id)
from gsu.user_profile gup
join (select cud.user_id from css.user_desc cud
minus
select cud.new_user_id from css.user_desc cud) cudsubq
on gup.user_id = cudsubq.user_id;
You want user_ids from table gup that don't match any new_user_id in table cud, right? It sounds like a job for a left join:
SELECT count(gup.user_id)
FROM gsu.user_profile gup LEFT JOIN css.user_desc cud
ON gup.user_id = cud.new_user_id
WHERE cud.new_user_id is NULL
The join keeps all rows of gup, matching them with a new_user_id if possible. The WHERE condition keeps only the rows that have no matching row in cud.
(Apologies if you know this already and you're only interested in the behavior of the not in query)
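For comparison, here is a small sketch of the anti-join patterns side by side. It assumes Python's sqlite3 as a stand-in for Oracle, drops the schema prefixes, and uses invented rows in which new_user_id is sometimes NULL.

```python
import sqlite3

# Invented sample data: user 2 is the only one referenced as a new_user_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_profile (user_id INTEGER PRIMARY KEY);
    CREATE TABLE user_desc (user_id INTEGER PRIMARY KEY, new_user_id INTEGER);
    INSERT INTO user_profile VALUES (1), (2), (3), (4);
    INSERT INTO user_desc VALUES (1, 2), (2, NULL), (3, NULL);
""")

# LEFT JOIN anti-join: keep profile rows with no matching new_user_id.
left_join_count = conn.execute("""
    SELECT COUNT(gup.user_id)
    FROM user_profile gup
    LEFT JOIN user_desc cud ON gup.user_id = cud.new_user_id
    WHERE cud.new_user_id IS NULL
""").fetchone()[0]

# NOT EXISTS gives the same answer and is not affected by NULLs.
not_exists_count = conn.execute("""
    SELECT COUNT(gup.user_id)
    FROM user_profile gup
    WHERE NOT EXISTS (SELECT 1 FROM user_desc cud
                      WHERE cud.new_user_id = gup.user_id)
""").fetchone()[0]

# NOT IN over a subquery that returns NULLs never evaluates to true.
not_in_count = conn.execute("""
    SELECT COUNT(gup.user_id)
    FROM user_profile gup
    WHERE gup.user_id NOT IN (SELECT cud.new_user_id FROM user_desc cud)
""").fetchone()[0]

print(left_join_count, not_exists_count, not_in_count)
```

The first two queries count users 1, 3 and 4; the unfiltered NOT IN counts nothing, which is why the first query in the question needs its "is not null" filter.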

Oracle: MIN() Statement causes empty row returns

I'm having a small issue with sorting the data returned from a query, with the aim of getting the oldest updated value in dataset so that I can update only that record. Here's what I'm doing:
WHERE ROWNUM = 1 AND TABLE1.ID != V_IGNOREID
AND TABLE1.LASTREADTIME = (SELECT MIN(TABLE1.LASTREADTIME) FROM TABLE1)
ORDER BY TABLE1.LASTREADTIME DESC;
It makes no difference as to whether the ORDER BY statement is included or not. If I only use the ROWNUM and equality checks, I get data, but it alternates between only two rows, which is why I'm trying to use the LASTREADTIME data (so that I can modify more than these two rows). Anybody have any thoughts on this, or any suggestions as to how I can use the MIN function effectively?
Cheers
select * from (
-- your original select without rownum and with order by
)
WHERE ROWNUM = 1
EDIT some explanation
I think the order by clause is applied on the resultset after the where clause. So if the rownum = 1 is in the same select statement with the order by, then it will be applied first and the order by will order only 1 row, which will be the first row of the unordered resultset.
