Oracle: can I avoid a lot of JOINs?

I have the following tables in Oracle 12c:
objects (about 600 K rows)
attributes (about 10 M rows)
I need to transpose objects' attributes into another table. The idea is shown in the picture below.
To do so, I currently use a lot of JOINs, like:
SELECT
o.id,
o.name,
a1.value as attr_1,
a2.value as attr_2,
a3.value as attr_3,
a4.value as attr_4,
a5.value as attr_5
FROM objects o
LEFT JOIN attributes a1 ON a1.obj_id = o.id AND a1.attr_id = 1
LEFT JOIN attributes a2 ON a2.obj_id = o.id AND a2.attr_id = 2
LEFT JOIN attributes a3 ON a3.obj_id = o.id AND a3.attr_id = 3
LEFT JOIN attributes a4 ON a4.obj_id = o.id AND a4.attr_id = 4
LEFT JOIN attributes a5 ON a5.obj_id = o.id AND a5.attr_id = 5
Some of my queries use up to 20 attributes, so I have to join the 10 M row attributes table 20 times.
It works, but it takes a long time.
Do you have any good ideas on how to organize this better?

One possibility, as mentioned in comments, is to use the PIVOT option on the SELECT. I am not clear what your concern is about this -- are you just saying you want to determine the actual resulting column names yourself? You can easily alias the columns after the PIVOT operation.
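As a rough sketch of the PIVOT approach, assuming the objects and attributes tables and the five attribute ids from the question (the alias p and the attr_1 to attr_5 column aliases are illustrative choices, not something the question prescribes):

```sql
-- Pivot the attributes once, then join the result to objects.
-- The aliases in the IN list become the output column names.
SELECT o.id,
       o.name,
       p.attr_1, p.attr_2, p.attr_3, p.attr_4, p.attr_5
FROM objects o
LEFT JOIN (
    SELECT *
    FROM (SELECT obj_id, attr_id, value FROM attributes)
    PIVOT (MAX(value) FOR attr_id IN (1 AS attr_1,
                                      2 AS attr_2,
                                      3 AS attr_3,
                                      4 AS attr_4,
                                      5 AS attr_5))
) p ON p.obj_id = o.id;
```

MAX is used only because PIVOT requires an aggregate; it assumes at most one value per (obj_id, attr_id) pair.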
Before PIVOT existed, the standard method I used and saw others use to accomplish it was like this:
SELECT obj_id,
MAX(CASE WHEN attr_id = 1 THEN value ELSE NULL END) AS attr_1,
MAX(CASE WHEN attr_id = 2 THEN value ELSE NULL END) AS attr_2,
... etc. ...
FROM attributes
GROUP BY obj_id
For the full query, you could put this subquery in a CTE and join that with objects.
Note this doesn't necessarily mean that Oracle will execute the entire subquery before doing the join. In your case it might, since I assume every row in objects has corresponding rows in attributes. But if you had a filter on obj_id in the query, it might filter on that first then do the grouping. In any case, I'd certainly expect this to be more efficient than joining many times.
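Put together, the CTE-plus-join form described above might look like the following sketch. The CTE name attr_pivot is made up for illustration, and the WHERE attr_id IN (...) filter is an addition that assumes only these five attributes are needed:

```sql
WITH attr_pivot AS (
    -- One pass over attributes: one output row per object
    SELECT obj_id,
           MAX(CASE WHEN attr_id = 1 THEN value END) AS attr_1,
           MAX(CASE WHEN attr_id = 2 THEN value END) AS attr_2,
           MAX(CASE WHEN attr_id = 3 THEN value END) AS attr_3,
           MAX(CASE WHEN attr_id = 4 THEN value END) AS attr_4,
           MAX(CASE WHEN attr_id = 5 THEN value END) AS attr_5
    FROM attributes
    WHERE attr_id IN (1, 2, 3, 4, 5)   -- assumption: ignore other attributes early
    GROUP BY obj_id
)
SELECT o.id,
       o.name,
       ap.attr_1, ap.attr_2, ap.attr_3, ap.attr_4, ap.attr_5
FROM objects o
LEFT JOIN attr_pivot ap ON ap.obj_id = o.id;  -- keeps objects that have no attributes
```

The LEFT JOIN mirrors the original query, so objects without any matching attribute rows still appear with NULLs.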

You can use LEVEL and CONNECT BY to generate the SQL, as shown below. If you give the exact SQL skeleton, we could also make use of user_tab_cols to generate the SQL.
SELECT 'SELECT o.id, o.name' sql_text
FROM dual
UNION ALL
-- one ", aN.value AS attr_N" line per attribute
SELECT ', a'
||LEVEL
||'.value AS attr_'
||LEVEL
FROM dual
CONNECT BY LEVEL <= 5
UNION ALL
SELECT 'FROM objects o'
FROM dual
UNION ALL
-- one LEFT JOIN line per attribute
SELECT 'LEFT JOIN attributes a'
||LEVEL
||' ON a'
||LEVEL
||'.obj_id = o.id AND a'
||LEVEL
||'.attr_id = '
||LEVEL
FROM dual
CONNECT BY LEVEL <= 5

Related

how to select specific columns from three different tables in Oracle SQL

I am trying to select values from three different tables.
When I select all columns it works well, but if I select specific columns, I get SQL Error [42000]: JDBC-8027:Column name is ambiguous.
This is the query that selects everything and works:
SELECT
*
FROM (SELECT x.*, B.*,C.* , COUNT(*) OVER (PARTITION BY x.POLICY_NO) policy_no_count
FROM YIP.YOUTH_POLICY x
LEFT JOIN
YIP.YOUTH_POLICY_AREA B
ON x.POLICY_NO = B.POLICY_NO
LEFT JOIN
YIP.YOUTH_SMALL_CATEGORY C
ON B.SMALL_CATEGORY_SID = C.SMALL_CATEGORY_SID
ORDER BY x.POLICY_NO);
And this is the query that fails:
SELECT DISTINCT
x.POLICY_NO,
x.POLICY_TITLE,
policy_no_count ,
B.SMALL_CATEGORY_SID,
C.SMALL_CATEGORY_TITLE
FROM (SELECT x.*, B.*,C.* , COUNT(*) OVER (PARTITION BY x.POLICY_NO) policy_no_count
FROM YIP.YOUTH_POLICY x
LEFT JOIN
YIP.YOUTH_POLICY_AREA B
ON x.POLICY_NO = B.POLICY_NO
LEFT JOIN
YIP.YOUTH_SMALL_CATEGORY C
ON B.SMALL_CATEGORY_SID = C.SMALL_CATEGORY_SID
ORDER BY x.POLICY_NO);
What I am trying to do: when a POLICY_NO value is duplicated in more than 18 rows, I want to change the C.SMALL_CATEGORY_TITLE values to "ZZ" and the B.SMALL_CATEGORY_SID values to null.
That is why I nest two SELECTs in the query, like this:
SELECT DISTINCT
x.POLICY_NO,
CASE WHEN (policy_no_count > 17) THEN 'ZZ' ELSE C.SMALL_CATEGORY_TITLE END AS C.SMALL_CATEGORY_TITLE,
CASE WHEN (policy_no_count > 17) THEN NULL ELSE B.SMALL_CATEGORY_SID END AS B.SMALL_CATEGORY_SID,
x.POLICY_TITLE
FROM (SELECT x.*, B.*,C.* , COUNT(*) OVER (PARTITION BY x.POLICY_NO) policy_no_count
FROM YIP.YOUTH_POLICY x
LEFT JOIN
YIP.YOUTH_POLICY_AREA B
ON x.POLICY_NO = B.POLICY_NO
LEFT JOIN
YIP.YOUTH_SMALL_CATEGORY C
ON B.SMALL_CATEGORY_SID = C.SMALL_CATEGORY_SID
ORDER BY x.POLICY_NO);
If I use that query, I get SQL Error [42000]: JDBC-8006:Missing FROM keyword. at line 3, column 80 of null.
I know I should solve it step by step. Is there any way to select specific columns?
That's most probably because of SELECT x.*, B.*, C.*. Avoid asterisks: explicitly name all the columns you need, and pay attention to possible duplicate column names; if you have any, use column aliases.
For example, if that select (which is in a subquery) evaluates to
select x.id, x.name, b.id, b.name
then the outer query doesn't know which id you want, as two columns are named id (and two are named name), so you'd have to
select x.id as x_id,
x.name as x_name,
b.id as b_id,
b.name as b_name
from ...
and, in the outer query, select not just id but e.g. x_id.
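Applied to the query from the question, that advice might look like the sketch below. It uses only the four columns the question selects; the lower-case aliases are illustrative. Two further points are worth noting: the table aliases x, B and C are not visible outside the inline view, so the outer query must refer to the column aliases alone, and a column alias cannot contain a dot (AS C.SMALL_CATEGORY_TITLE), which is a plausible cause of the "Missing FROM keyword" error.

```sql
SELECT DISTINCT
       policy_no,
       policy_title,
       -- blank out the category when the policy is duplicated too often
       CASE WHEN policy_no_count > 17 THEN NULL ELSE small_category_sid   END AS small_category_sid,
       CASE WHEN policy_no_count > 17 THEN 'ZZ' ELSE small_category_title END AS small_category_title
FROM (SELECT x.POLICY_NO            AS policy_no,
             x.POLICY_TITLE         AS policy_title,
             B.SMALL_CATEGORY_SID   AS small_category_sid,
             C.SMALL_CATEGORY_TITLE AS small_category_title,
             COUNT(*) OVER (PARTITION BY x.POLICY_NO) AS policy_no_count
      FROM YIP.YOUTH_POLICY x
      LEFT JOIN YIP.YOUTH_POLICY_AREA B
             ON x.POLICY_NO = B.POLICY_NO
      LEFT JOIN YIP.YOUTH_SMALL_CATEGORY C
             ON B.SMALL_CATEGORY_SID = C.SMALL_CATEGORY_SID)
ORDER BY policy_no;
```

The ORDER BY is moved to the outer query, since ordering inside the inline view does not determine the order of the final DISTINCT result.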

select statement should return count as zero if no row return using group by clause

I have a table student_info with a column "status". The status can be P (present), A (absent), S (ill), T (transfer), or L (left).
The expected output is as below.
status count(*)
P 12
S 1
A 2
T 0
L 0
But the actual output is:
Status Count(*)
P 12
S 1
A 2
We need rows for statuses T and L as well, with a count of zero, even though no such records exist in the DB.
@mkuligowski's approach is close, but you need an outer join from the CTE providing all of the possible status values to your real table, and then you need to count only the entries that actually match:
-- CTE to generate all possible status values
with stored_statuses (status) as (
select 'A' from dual
union all select 'L' from dual
union all select 'P' from dual
union all select 'S' from dual
union all select 'T' from dual
)
select ss.status, count(si.status)
from stored_statuses ss
left join student_info si on si.status = ss.status
group by ss.status;
STATUS COUNT(SI.STATUS)
------ ----------------
P 12
A 2
T 0
S 1
L 0
The CTE acts as a dummy table holding the five statuses you want to count. That is then outer joined to your real table - the outer join means the rows from the CTE are still included even if there is no match - and then the rows that are matched in your table are counted. That allows the zero counts to be included.
You could also do this with a collection:
select ss.status, count(si.status)
from (
select column_value as status from table(sys.odcivarchar2list('A','L','P','S','T'))
) ss
left join student_info si on si.status = ss.status
group by ss.status;
It would be preferable to have a physical table which holds those values (and their descriptions); you could also then have a primary/foreign key relationship to enforce the allowed values in your existing table.
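A minimal sketch of that physical-table variant follows. The table name student_status, the column sizes, and the constraint name are all assumptions for illustration; the foreign key also assumes student_info.status has a compatible data type and contains no values outside the five listed.

```sql
-- Hypothetical lookup table holding the allowed statuses
CREATE TABLE student_status (
    status      VARCHAR2(1)  PRIMARY KEY,
    description VARCHAR2(20) NOT NULL
);

INSERT INTO student_status (status, description) VALUES ('P', 'present');
INSERT INTO student_status (status, description) VALUES ('A', 'absent');
INSERT INTO student_status (status, description) VALUES ('S', 'ill');
INSERT INTO student_status (status, description) VALUES ('T', 'transfer');
INSERT INTO student_status (status, description) VALUES ('L', 'left');

-- Enforce the allowed values on the existing table
ALTER TABLE student_info
    ADD CONSTRAINT student_info_status_fk
    FOREIGN KEY (status) REFERENCES student_status (status);

-- Same outer-join count, now driven by the lookup table
SELECT ss.status, COUNT(si.status)
FROM student_status ss
LEFT JOIN student_info si ON si.status = ss.status
GROUP BY ss.status;
```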
If all the status values actually appear in your table, but you have a filter which happens to exclude all rows for some of them, then you could get the list of all (used) values from the table itself instead of hard-coding it.
If your initial query was something like this, with a completely made-up filter:
select si.status, count(*)
from student_info si
where si.some_condition = 'true'
group by si.status;
then you could use a subquery to get all the distinct values from the unfiltered table, outer join from that to the same table, and apply the filter as part of the outer join condition:
select ss.status, count(si.status)
from (
select distinct status from student_info
) ss
left join student_info si on si.status = ss.status
and si.some_condition = 'true'
group by ss.status;
It can't stay as a where clause (at least here, where it's applying to the right-hand-side of the outer join) because that would override the outer join and effectively turn it back into an inner join.
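For contrast, here is the same query with the (made-up) filter left in the WHERE clause, as a sketch of what goes wrong:

```sql
-- Filter in WHERE: it is applied after the outer join, so every joined row
-- that fails the test is discarded. A status whose rows all fail the filter
-- therefore disappears from the result instead of showing a zero count.
select ss.status, count(si.status)
from (
  select distinct status from student_info
) ss
left join student_info si on si.status = ss.status
where si.some_condition = 'true'
group by ss.status;
```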
You should store your statuses somewhere (perhaps in another table). Otherwise, you can list them using a subquery:
with stored_statuses as (
select 'P' code, 'present' description from dual
union all
select 'A' code, 'absent' description from dual
union all
select 'S' code, 'ill' description from dual
union all
select 'T' code, 'transfer' description from dual
union all
select 'L' code, 'left' description from dual
)
select ss.code, count(*) from student_info si
left join stored_statuses ss on ss.code = si.status
group by ss.code

Order by position

Let's say we have two tables
TableA (A1,A2) , TableB(B1,B2)
Is there any difference (in terms of performance or memory usage) between the two queries below in Oracle? Only the position of the ORDER BY clause differs.
Select Y.*, ROWNUM rNum FROM (
select * from
TableA a join TableB b on a.A1 = b.B1
Where a.A2 = 'SomeVal'
Order by b.B2
) Y
Select Y.*, ROWNUM rNum FROM (
select * from
TableA a join TableB b on a.A1 = b.B1
Where a.A2 = 'SomeVal'
) Y
Order by B2
Yes -- in the latter the rownum is assigned prior to the rows being ordered, and in the former the rownum is assigned after the rows are ordered.
So the first query's rownums might read as, "1,2,3,4,5 ...", whereas the second query's rownums might read, "33,3,5,45,1 ..."
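A small self-contained sketch of the difference, using a hypothetical three-row data set in place of the joined tables (the CTE name t and its values are invented for illustration):

```sql
-- ORDER BY inside the inline view: rows are sorted first, then numbered,
-- so rnum follows the sort order of b2.
WITH t AS (
    SELECT 30 AS b2 FROM dual UNION ALL
    SELECT 10 FROM dual UNION ALL
    SELECT 20 FROM dual
)
SELECT y.*, ROWNUM AS rnum
FROM (SELECT b2 FROM t ORDER BY b2) y;

-- ORDER BY outside: ROWNUM is assigned as rows come out of the inline view,
-- and only then is the result sorted, so rnum need not follow the b2 order.
WITH t AS (
    SELECT 30 AS b2 FROM dual UNION ALL
    SELECT 10 FROM dual UNION ALL
    SELECT 20 FROM dual
)
SELECT y.*, ROWNUM AS rnum
FROM (SELECT b2 FROM t) y
ORDER BY b2;
```

In the first statement the row with b2 = 10 gets rnum 1; in the second, the numbering depends on the order in which the rows happen to be fetched.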

How to get minimum unused number from a column in Oracle?

In my database I have a table with a column that holds the code of each record (separate from the ID column). This field is unique, and each time the user tries to insert a record into the table, the first unused code should be assigned to the record. The table currently has the following codes, in this order:
+------+
| code |
+------+
|    1 |
|    2 |
|    3 |
|    5 |
+------+
I want a query to return 4 as the result.
Note that this query runs very frequently in my system, so the query with the lowest execution time would be appreciated.
Is using a self-join acceptable? If so:
-- your test data:
WITH data AS (SELECT 1 AS code FROM DUAL
UNION SELECT 2 FROM DUAL
UNION SELECT 3 FROM DUAL
UNION SELECT 5 FROM DUAL)
-- request:
SELECT COALESCE(MIN(d1.code+1),1)
FROM data d1 LEFT JOIN data d2 ON d1.code+1 = d2.code
WHERE d2.code IS NULL;
This will build the list of data.code without a successor. And using MIN(...+1) you will get the first empty slot. I used COALESCE(...) in order to handle the specific case where there isn't any entry in the data table.
An alternate form using a row generator might lead to better performance, as it does not require the whole table to be traversed in order to compute the aggregate function MIN():
-- your test data:
WITH data AS (SELECT 1 AS code FROM DUAL
UNION SELECT 5 FROM DUAL
UNION SELECT 2 FROM DUAL
UNION SELECT 3 FROM DUAL)
-- request:
SELECT T.code FROM (SELECT d1.code
FROM (SELECT LEVEL code FROM DUAL CONNECT BY LEVEL < 9999) d1 LEFT JOIN data d2
ON d1.code = d2.code
WHERE d2.code IS NULL
ORDER BY d1.code ASC
) T WHERE ROWNUM < 2
The drawback is that you now have a hard-coded upper limit. It could be inferred dynamically from the data table, though, so this is not really a blocker. I'll let you compare the timings yourself.
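One way the upper limit might be derived from the data itself is sketched below, reusing the test data from above. The bound MAX(code) + 1 is an assumption chosen so that a gap-free table yields the next code after the current maximum, and an empty table yields 1:

```sql
-- your test data:
WITH data AS (SELECT 1 AS code FROM DUAL
              UNION ALL SELECT 2 FROM DUAL
              UNION ALL SELECT 3 FROM DUAL
              UNION ALL SELECT 5 FROM DUAL)
-- request: generate 1 .. MAX(code)+1 and keep the smallest value not in data
SELECT MIN(g.code) AS first_unused
FROM (SELECT LEVEL AS code
      FROM DUAL
      CONNECT BY LEVEL <= (SELECT COALESCE(MAX(code), 0) + 1 FROM data)) g
LEFT JOIN data d ON d.code = g.code
WHERE d.code IS NULL;
```

With the sample data this generates the values 1 to 6, of which 4 and 6 are unused, so the query returns 4.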
"this field is unique and each time the user tries to insert a record into the table, the first unused code should be assigned to the record"
Please note however this will lead to a race condition if two concurrent sessions try to insert a row at the same time. Given your example, they will both try to insert a row with code = 4 -- obviously both will not succeed in doing so as your column is unique...
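One possible way to cope with that, sketched in PL/SQL, is to let the unique constraint arbitrate and retry on a collision. The table name records is hypothetical, and the sketch assumes the ID column is populated automatically (for example by an identity column or a trigger):

```sql
BEGIN
    LOOP
        BEGIN
            -- Compute the first unused code and insert it in one statement
            INSERT INTO records (code)
            SELECT COALESCE(MIN(r1.code + 1), 1)
            FROM records r1
            LEFT JOIN records r2 ON r2.code = r1.code + 1
            WHERE r2.code IS NULL;
            EXIT;  -- the insert succeeded
        EXCEPTION
            WHEN DUP_VAL_ON_INDEX THEN
                NULL;  -- another session took that code: loop and recompute
        END;
    END LOOP;
END;
/
```

The transaction is left for the caller to commit. A session that collides with an uncommitted insert waits until the other session commits or rolls back, then either retries or proceeds.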
I recently used the code below:
SELECT t1.id+1
FROM table t1
LEFT OUTER JOIN table t2 ON (t1.id + 1 = t2.id)
WHERE t2.id IS NULL
/* and rownum = 1 Need to use a sub select if you want this to work */
ORDER BY t1.id;
I run it every time that I need to insert a new row and use the minimum unused id.
I hope it works for your purposes.
select min(unusedval) from (
select level unusedval from dual connect by level < 10
minus
select tno from t2);
You can change the LEVEL condition depending on the maximum value.

Using Linq to select a table based on inner join of top 5 from another table

I am trying to select from a table in my database based on the top 5 values from another table, and I have hit a roadblock.
Here is the version without the top 5 values:
from d in Deals
from f in FacebookUserCategories
from s in SubCategories
where s.FacebookCategoryId == f.FacebookCategoryId
&& f.FacebookUserId == 1437585390
orderby f.Count descending
select d
However, what I need is to select Deals based on the top 5 Ids from the SubCategories table, meaning I have to use the Take operator.
The LINQ below gets me those Ids:
(from f in FacebookUserCategories
from s in SubCategories
where s.FacebookCategoryId == f.FacebookCategoryId
orderby f.Count descending
select s.Id).Take(5)
Is there any way for me to select from the Deals table, joining on SubCategoryId, from here?
Just to recap: I could write the SQL, and it would look like this:
SELECT t1.* FROM Deal t1
INNER JOIN (
SELECT TOP 5 t2.Id FROM FacebookUserCategory , SubCategory t2
WHERE FacebookUserId = '1437585390'
AND FacebookUserCategory.FacebookCategoryId = t2.FacebookCategoryId
ORDER BY Count DESC) tbl
ON t1.SubCategoryId = tbl.Id
Try this. It uses an explicit join between FacebookUserCategories and SubCategories, takes the top 5 Ids, and then filters Deals with Contains:
var result = from d in Deals
let top5Counts =
(from f in FacebookUserCategories
join s in SubCategories on f.FacebookCategoryId equals s.FacebookCategoryId
where f.FacebookUserId == 1437585390
orderby f.Count descending
select s.Id).Take(5)
where top5Counts.Contains(d.SubCategoryId.Value)
select d;
