Oracle join returning too many rows - oracle

I have the following select statements in Oracle 19:
select count(*) as facility_load_stg_cnt from facility_load_stg;
select count(*) as facility_cnt from facility;
select count(*) as loctn_typ_cnt from loctn_typ;
select count(*) as join_cnt from (
select f.facility_id, lt.loctn_typ_id
from facility_load_stg stg
inner join facility f on stg.facility_cd = f.facility_cd
inner join loctn_typ lt on stg.bldg = lt.loctn_typ_nm);
The objective is simple: get the primary keys for facility and loctn_typ (facility_id, loctn_typ_id) to insert into the table that will build the relationship.
The problem is that when I run the above code, I get these results:
FACILITY_LOAD_STG_CNT
---------------------
987
FACILITY_CNT
------------
645
LOCTN_TYP_CNT
-------------
188
JOIN_CNT
----------
2905
Why does the select with the join have 3x the rows of the facility_load_stg table? I am sure I am doing something silly, I just cannot see it.
Folks have asked to see the table definitions; here are the relevant parts:
create table FACILITY_MAINT_DATA.FACILITY_LOAD_STG
(
FACILITY_CD VARCHAR2(100) not null,
BLDG VARCHAR2(100)
)
create table FACILITY_MAINT_DATA.FACILITY
(
FACILITY_CD VARCHAR2(100),
FACILITY_ID NUMBER(14) not null
constraint PK_FACILITY
primary key
)
create table FACILITY_MAINT_DATA.LOCTN_TYP
(
LOCTN_TYP_ID NUMBER(14) default "FACILITY_MAINT_DATA"."LOCTN_TYP_ID_SEQ"."NEXTVAL" not null
constraint PK_LOCTN_TYP
primary key,
LOCTN_TYP_NM VARCHAR2(100)
)

The general rule when a join returns more rows than you expect is that the join keys are not unique. A key value that occurs m times in one table and n times in the other contributes m x n rows to the result.
Here is a simple example reproducing the same result as yours (limited to two tables).
create table tab1 as
select rownum id,
case when rownum <= 80 then 'CD_'||rownum else 'CD_X' end cd from dual connect by level <= 645;
create table tab2 as
select rownum id,
case when rownum <= 80 then 'CD_'||rownum
when rownum <= 85 then 'CD_X'
else 'CD_Y' end cd from dual connect by level <= 987;
select count(*) from tab1;
COUNT(*)
----------
645
select count(*) from tab2;
COUNT(*)
----------
987
select count(*)
from tab1
join tab2
on tab1.cd = tab2.cd;
COUNT(*)
----------
2905
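In this example the 80 unique codes contribute 80 rows and CD_X contributes 565 x 5 = 2825 rows, for 2905 in total.
In summary: change the join to use unique keys, or restrict the join to CD values that are unique.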
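One way to check this on the tables from the question is to look for join-key values that occur more than once on the lookup side. The queries below are only a sketch and not part of the original answer; they assume nothing beyond the column names in the posted table definitions.

```sql
-- Facility codes that occur more than once in FACILITY
select facility_cd, count(*) as cnt
from facility
group by facility_cd
having count(*) > 1
order by cnt desc;

-- Location type names that occur more than once in LOCTN_TYP
select loctn_typ_nm, count(*) as cnt
from loctn_typ
group by loctn_typ_nm
having count(*) > 1
order by cnt desc;
```

Neither FACILITY_CD nor LOCTN_TYP_NM has a unique constraint in the posted definitions, so duplicates in either column are possible. Every value returned here multiplies each matching staging row by its count.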

Thank you everyone for the comments. They got me looking in detail at the actual data, and there are duplicates. Now I just have to figure out why, which needs some investigation :)

Related

(Oracle)Correlated subquery usage

Query1 below works fine.
But when I put the equivalent condition inside a nested WITH clause, as in Query2, it fails with ORA-00904.
Is this a wrong use of a correlated subquery, or is there another reason?
--Query1: returns the expected result.
SELECT
O.ENAME
,O.SAL
,(SELECT COUNT(*)
FROM SCOTT.EMP I
WHERE I.SAL>O.SAL --correlated to outer
) AS RESULT
from SCOTT.EMP O;
--Query2: fails with ORA-00904: "O"."SAL": invalid identifier. How should I modify it to use a correlated subquery?
SELECT
O.ENAME
,O.SAL
,(
WITH TEMP AS
(
SELECT COUNT(*)
FROM SCOTT.EMP I
WHERE I.SAL>O.SAL --I have put the equivalent condition here
)
SELECT * FROM TEMP
) AS RESULT
from SCOTT.EMP O;
I believe the second query is an incorrect use of a correlated subquery, not because of the comparison but because of the WITH clause. I would also remind you that correlated subqueries are best avoided where possible.
The WITH clause, or subquery factoring clause, may be processed as an inline view or resolved as a temporary table. The advantage of the latter is that repeated references to the subquery may be more efficient as the data is easily retrieved from the temporary table, rather than being requeried by each reference.
In the third column of your second query you want to get the result from the inline view. The problem is that the inline view is parsed independently and therefore cannot reference anything in the outer query.
SQL> create table emp ( ename varchar2(10) , sal number ) ;
Table created.
SQL> insert into emp values ( 'AAA' , 1000 ) ;
1 row created.
SQL> insert into emp values ( 'BBB' , 1000 ) ;
1 row created.
SQL> insert into emp values ( 'CCC' , 1000 ) ;
1 row created.
SQL> insert into emp values ( 'DDD' , 1000 ) ;
1 row created.
SQL> select * from emp ;
ENAME SAL
---------- ----------
AAA 1000
BBB 1000
CCC 1000
DDD 1000
To write the query with an inline view, the filter must be applied in the outer query:
SELECT
O.ENAME
,O.SAL
,(
WITH TEMP AS
(
SELECT * FROM EMP
)
SELECT count(*) FROM TEMP t WHERE t.SAL>O.SAL
) AS RESULT
from EMP O;
ENAME SAL RESULT
---------- ---------- ----------
AAA 1000 0
BBB 1000 0
CCC 1000 0
DDD 1000 0
It has already been explained that the second query does not use WITH correctly.
Let me suggest, however, that your query can be simpler and more efficiently phrased. For each employee, you want to count how many employees have a greater salary. Window functions are the way to go here:
select e.*, rank() over(order by sal desc) - 1 as result
from scott.emp e;
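To see why rank() minus one matches the correlated count, the two can be compared side by side. The sketch below uses a hypothetical inline data set named emp_demo in place of SCOTT.EMP, and assumes the salary is never NULL (Oracle sorts NULLs first in a descending order, which would make the two columns diverge).

```sql
-- emp_demo is made-up sample data standing in for SCOTT.EMP
with emp_demo as (
  select 'A' as ename, 3000 as sal from dual union all
  select 'B', 2000 from dual union all
  select 'C', 2000 from dual union all
  select 'D', 1000 from dual
)
select o.ename,
       o.sal,
       -- correlated scalar subquery: employees with a strictly greater salary
       (select count(*) from emp_demo i where i.sal > o.sal) as cnt_subquery,
       -- rank() is 1 + the number of rows that sort strictly before this one
       rank() over (order by o.sal desc) - 1 as cnt_rank
from emp_demo o
order by o.sal desc, o.ename;
```

Both computed columns come out as 0, 1, 1, 3 for A, B, C, D. The tie between B and C shows why rank() rather than row_number() or dense_rank() is the function that matches the count.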

insert all and inner join in oracle

I would like to insert data into two tables that have a one-to-many relationship. For this I have to use a foreign key, of course.
I think the table1 ID column is ideal as the primary key, but it is always generated automatically by a trigger, for every row. So:
How can I put table1.ID (the auto-generated primary key) into the table2.fkey column in the same insert statement?
INSERT ALL INTO table1 ( --here (before this) generated the table1.id column automatically with a trigger.
table1.food,
table1.drink,
table1.shoe
) VALUES (
'apple',
'water',
'slippers'
)
INTO table2 (
fkey,
color
) VALUES (
table1.id, -- I would like table2.fkey == table1.id this gave me error
'blue'
) SELECT
*
FROM
table1
INNER JOIN table2 ON table1.id = table2.fkey;
The error message:
"00904. 00000 - "%s: invalid identifier""
As suggested by @OldProgrammer, use a sequence:
INSERT ALL INTO table1 ( -- the id is taken from the sequence here instead of relying on the trigger
id,
food,
drink,
shoe
) VALUES (
<sequencename_table1>.nextval,
'apple',
'water',
'slippers'
)
INTO table2 (
fkey,
color
) VALUES (
<sequencename_table1>.currval, -- returns the current value of the sequence
'blue'
) SELECT
*
FROM
dual; -- single-row source, so each INTO clause inserts exactly one row
Since you are using Oracle Database 12c, you can use an identity column. Capture the generated table1 ID in a local variable with a RETURNING clause on the insert into table1, then use that variable in the next insert, which is for table2, as shown below:
SQL> create table table1(
2 ID integer generated always as identity primary key,
3 food varchar2(50), drink varchar2(50), shoe varchar2(50)
4 );
SQL> create table table2(
2 fkey integer references table1(ID),
3 color varchar2(50)
4 );
SQL> declare
2 cl_tab table1.id%type;
3 begin
4 insert into table1(food,drink,shoe) values('apple','water','slippers' )
5 returning id into cl_tab;
6 insert into table2 values(cl_tab,'blue');
7 end;
8 /
SQL> select * from table1;
ID FOOD DRINK SHOE
-- ------- ------- -------
1 apple water slippers
SQL> select * from table2;
FKEY COLOR
---- --------------------------------------------------
1 blue
Whenever you run the inserts above between begin and end, table1.ID and table2.fkey are populated with the same integer value. Also, do not forget to commit the inserts if you need these values to be visible throughout the database (i.e. from other sessions as well).

Query taking long when i use user defined function with order by in oracle select

I have a function which gets the greatest of three dates from three tables.
create or replace FUNCTION fn_max_date_val(
pi_user_id IN number)
RETURN DATE
IS
l_modified_dt DATE;
l_mod1_dt DATE;
l_mod2_dt DATE;
ret_user_id DATE;
BEGIN
SELECT MAX(last_modified_dt)
INTO l_modified_dt
FROM table1
WHERE id = pi_user_id;
-- this table contains a million records
SELECT nvl(MAX(last_modified_ts),sysdate-90)
INTO l_mod1_dt
FROM table2
WHERE table2_id=pi_user_id;
-- this table contains clob data, 800 000 records, the table 3 does not have user_id and has to fetched from table 2, as shown below
SELECT nvl(MAX(last_modified_dt),sysdate-90)
INTO l_mod2_dt
FROM table3
WHERE table2_id IN
(SELECT id FROM table2 WHERE table2_id=pi_user_id
);
execute immediate 'select greatest('''||l_modified_dt||''','''||l_mod1_dt||''','''||l_mod2_dt||''') from dual' into ret_user_id;
RETURN ret_user_id;
EXCEPTION
WHEN OTHERS THEN
return SYSDATE;
END;
This function works perfectly fine and executes within a second.
-- random user_id , just to test the functionality
SELECT fn_max_date_val(100) as max_date FROM DUAL
MAX_DATE
--------
27-02-14
For reference I have used the table names table1, table2 and table3, but my business case is similar to what I describe below.
I need to get the details of table1 along with the latest modified date among the three tables.
I did something like this:
SELECT a.id,a.name,a.value,fn_max_date_val(id) as max_date
FROM table1 a where status_id ='Active';
The above query executes perfectly fine and returns its result in milliseconds. But the problem came when I tried to use ORDER BY.
SELECT a.id,a.name,a.value,a.status_id,last_modified_dt,fn_max_date_val(id) as max_date
FROM table1 a where status_id ='Active'
order by status_id desc,last_modified_dt desc ;
-- It took almost 300 seconds to complete
I also tried an index on status_id and last_modified_dt, but no luck. Can this be done in a better way?
How about if your query is like this?
select a.*, fn_max_date_val(id) as max_date
from
(SELECT a.id,a.name,a.value,a.status_id,last_modified_dt
FROM table1 a where status_id ='Active'
order by status_id desc,last_modified_dt desc) a;
What if you don't use the function and do something like this:
SELECT a.id,a.name,a.value,a.status_id,last_modified_dt, x.max_date
FROM table1 a
(
select max(max_date) as max_date
from (
SELECT MAX(last_modified_dt) as max_date
FROM table1 t1
WHERE t1.id = a.id
union
SELECT nvl(MAX(last_modified_ts),sysdate-90) as max_date
FROM table2 t2
WHERE t2.table2_id=a.id
...
) y
) x
where a.status_id ='Active'
order by status_id desc,last_modified_dt desc;
The syntax might contain errors, but the idea is something like that, with the third table added to the derived table too.
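Another way to avoid calling the function once per row is to aggregate table2 and table3 once per user and outer join the results to table1. The query below is only a sketch of that idea, not the original poster's solution. It assumes table1.id is unique (so the function's MAX over table1 is simply the row's own last_modified_dt) and that table3.table2_id references table2.id, as the function in the question implies; the aliases m2 and m3 are made up for illustration.

```sql
select a.id, a.name, a.value, a.status_id, a.last_modified_dt,
       greatest(a.last_modified_dt,
                nvl(m2.max_ts, sysdate - 90),   -- same default as the function
                nvl(m3.max_dt, sysdate - 90)) as max_date
from table1 a
left join (
       -- latest table2 change per user, computed once for all users
       select table2_id, max(last_modified_ts) as max_ts
       from table2
       group by table2_id
     ) m2 on m2.table2_id = a.id
left join (
       -- latest table3 change per user, reached through table2
       select t2.table2_id, max(t3.last_modified_dt) as max_dt
       from table3 t3
       join table2 t2 on t2.id = t3.table2_id
       group by t2.table2_id
     ) m3 on m3.table2_id = a.id
where a.status_id = 'Active'
order by a.status_id desc, a.last_modified_dt desc;
```

Note that GREATEST returns NULL when any argument is NULL, so rows with a NULL table1.last_modified_dt get a NULL max_date here. Whether this runs faster than the function depends on the data volumes and available indexes, so compare the execution plans before adopting it.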

subquery inside INSERT statement

I just recently found out that subqueries are not allowed in INSERT statements that are inside stored procedures. This is my script:
begin
execute immediate 'truncate table itcustadm.GL_DTPJ_TEST2';
insert into GL_DTPJ_TEST2
(rule_no,
posted_by_user_id,
transaction_id,
transaction_sr_no,
dr_amount,
cr_amount,
tran_crncy_code,
bkdt_tran_flg,
bank_desc
)
select
tq.rule_no,
tq.posted_by_user_id,
tq.transaction_id,
tq.transaction_sr_no,
tq.dr_amount,
tq.cr_amount,
tq.tran_crncy_code,
tq.bkdt_tran_flg,
(select ent.bank_desc from crmuser.end ent where ent.bank_id = gam.bank_id);
But since the (select ent.bank_desc from crmuser.end ent where ent.bank_id = gam.bank_id) at the bottom of the SELECT statement is not allowed by Oracle, what's the best way to accomplish this?
I actually have this code right before the INSERT statement, but I don't know how to exactly use it:
get_bank_desc := '(select ent.bank_desc from crmuser.end ent ' ||
'where ent.bank_id = gam.bank_id)';
I am not sure exactly what you are trying to do, but the code below may be useful. You can insert the output of a subquery into a table as in the sample below, but make sure the subquery returns a single row, so that you avoid the "ORA-01427: single-row subquery returns more than one row" error.
insert into test_ins1
values(1,(SELECT COL2 FROM TEST_INS WHERE COL1=1 ));
Even then, you can add a rownum condition to the WHERE clause to take a single value.
Please let me know if you have any doubts.
declare
bank_desc_temp bank_desk_type; /* the type defined in crmuser.ent for bank_desc*/
begin
select ent.bank_desc into bank_desc_temp from crmuser.end ent where ent.bank_id = gam.bank_id;
execute immediate 'truncate table itcustadm.GL_DTPJ_TEST2';
insert into GL_DTPJ_TEST2
(rule_no,
posted_by_user_id,
transaction_id,
transaction_sr_no,
dr_amount,
cr_amount,
tran_crncy_code,
bkdt_tran_flg,
bank_desc
)
select
tq.rule_no,
tq.posted_by_user_id,
tq.transaction_id,
tq.transaction_sr_no,
tq.dr_amount,
tq.cr_amount,
tq.tran_crncy_code,
tq.bkdt_tran_flg,
bank_desc_temp;
end;
When you say "not allowed" what do you mean? Did you get an error?
I ask, because subqueries are definitely allowed inside an insert as select statement, providing you have the syntax correct (and the subquery returns at most one row), e.g.:
create table test_tab (col1 number, col2 varchar2(10));
begin
insert into test_tab
select 1,
(select 'Yes' from dual d2 where d.dummy = d2.dummy)
from dual d;
commit;
end;
/
select * from test_tab;
COL1 COL2
---------- ----------
1 Yes
There are some syntax issues with the code you provided - where is the from clause, and where are the tq and gam aliases defined?
There are two syntaxes you can use for an insert statement:
(I)
INSERT INTO table_name( column1, column2....columnN)
VALUES ( value1, value2....valueN);
(II)
INSERT INTO table (column1, column2, ... )
SELECT expression1, expression2, ...
FROM source_table(s)
WHERE conditions;
In your example, you should choose the second approach:
insert into GL_DTPJ_TEST2 (rule_no,
posted_by_user_id,
transaction_id,
transaction_sr_no,
dr_amount,
cr_amount,
tran_crncy_code,
bkdt_tran_flg,
bank_desc
)
select tq.rule_no,
tq.posted_by_user_id,
tq.transaction_id,
tq.transaction_sr_no,
tq.dr_amount,
tq.cr_amount,
tq.tran_crncy_code,
tq.bkdt_tran_flg,
ent.bank_desc
from crmuser.gam
join crmuser.end ent
on ent.bank_id = gam.bank_id
;
Basically, if you want to add records using an insert statement, you should write the full select statement first. Here is how I would build it up:
(1)
select *
from table1;
(2)
select column1
,column2
,column3
from table1;
(3)
select t1.column1
,t1.column2
,t1.column3
,t2.column4
,t2.column5
from table1 t1
join table2 t2
on t2.id = t1.id
;
(4)
insert into table3 (col1
,col2
,col3
,col4
,col5)
select t1.column1
,t1.column2
,t1.column3
,t2.column4
,t2.column5
from table1 t1
join table2 t2
on t2.id = t1.id
;

How to find records in one table which aren't in another

I'm very new to Oracle SQL and programming and I need some help with one of my first projects. I'm working with this table schema:
Column Data Type Length Precision Scale Nullable
EMPLOYEE_ID NUMBER 22 6 0 No
START_DATE DATE 7 - - No
END_DATE DATE 7 - - No
JOB_ID VARCHAR2 10 - - No
DEPARTMENT_ID NUMBER 22 4 0 Yes
I want to display all employees who have never changed their jobs, not even once (employees not listed in the above table). This table is named job_history. How would I go about doing this? I'm not sure how to get started.
select * from employees
where employee_id not in ( select employee_id from job_history)
/
You can use a left join and check for a null employee_id on the job_history table.
select * from employees
left join job_history
on job_history.employee_id = employees.employee_id
where job_history.employee_id is NULL
Fairly often the execution plan is better for
select * from employees e where not exists
(select 1 from job_history jh where jh.employee_id = e.employee_id)
than for "not in ()".
And if the tables have the same structure, the best result will be with
select * from employees
minus
select * from job_history
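MINUS compares whole rows, so the select * form only works when both tables have the same column list. When the column lists differ, which is likely here given the job_history columns shown in the question, a hedged variant is to apply MINUS to the shared key only and then fetch the matching employees:

```sql
select e.*
from employees e
where e.employee_id in (
        -- keys present in employees but absent from job_history
        select employee_id from employees
        minus
        select employee_id from job_history
      );
```

This returns the same employees as the NOT EXISTS and LEFT JOIN forms above; which of them performs best depends on the optimizer and the data.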
