PL/SQL : Need to compare data for every field in a table in plsql - oracle

I need to create a procedure that takes a collection as input and compares it with the staging table data row by row, for every field (approx. 50 columns).
Business logic:
Whenever a staging table column value does not match the corresponding collection value, I need to update the staging table's STATUS column to 'FAIL' and write the reason into the REASON column for that row.
If everything matches, I need to update STATUS to 'SUCCESS'.
The payload will be approx. 500 rows per call.
I have created the sample script below:
PKG Specification :
CREATE OR REPLACE
PACKAGE process_data
IS
TYPE pass_data_rec
IS
record
(
p_eid employee.eid%type,
p_ename employee.ename%type,
p_salary employee.salary%type,
p_dept employee.dept%type
);
type p_data_tab IS TABLE OF pass_data_rec INDEX BY binary_integer;
PROCEDURE comp_data(inpt_data IN p_data_tab);
END;
PKG Body:
CREATE OR REPLACE
PACKAGE body process_data
IS
PROCEDURE comp_data (inpt_data IN p_data_tab)
IS
status VARCHAR2(10);
reason VARCHAR2(1000);
cnt1 NUMBER;
v_eid employee_copy.eid%type;
v_ename employee_copy.ename%type;
BEGIN
FOR i IN 1..inpt_data.count
LOOP
SELECT ec1.eid,ec1.ename,COUNT(*) over () INTO v_eid,v_ename,cnt1
FROM employee_copy ec1
WHERE ec1.eid = inpt_data(i).p_eid;
IF cnt1 > 0 THEN
IF (v_eid=inpt_data(i).p_eid AND v_ename = inpt_data(i).p_ename) THEN
UPDATE employee_copy SET status = 'SUCCESS' WHERE eid = inpt_data(i).p_eid;
ELSE
UPDATE employee_copy SET status = 'FAIL' WHERE eid = inpt_data(i).p_eid;
END IF;
ELSE
NULL;
END IF;
END LOOP;
COMMIT;
status :='success';
EXCEPTION
WHEN OTHERS THEN
status:= 'fail';
--reason:=sqlerrm;
END;
END;
But this approach has the following issues:
I need to declare a local variable for each column value.
I need to compare all the variables using the AND operator. I am not sure whether this is the correct way, because with 50 columns the IF condition becomes very heavy.
IF (v_eid=inpt_data(i).p_eid AND v_ename = inpt_data(i).p_ename) THEN
I need to update the REASON column with the name of the first mismatched column for that row, which I cannot achieve with this approach.
Please suggest a better way to meet this requirement.
Edit:
There is only one table at my end, i.e. the target table. The input will come from another source as a collection object.

REVISED Answer
You could load the records into a temp table, but unless you want additional processing it's not necessary. AFAIK there is no way to identify the offending column (first one only) without slogging through column by column. However, your other concern, having to declare a variable per column, is unfounded: you can declare a single variable defined as %rowtype, which gives you access to each column by name.
Looping through an array of data to find the occasional error is just bad (imho) when SQL is available to eliminate the good rows in one fell swoop. And it's available here. Even though your input is an array, we can use it as a table via the TABLE operator, which lets SQL treat an array (collection) as though it were a database table. So the MINUS operator can still be employed. The following routine sets the appropriate status and identifies the first mismatched column for each entry in the input array. It reverts to your original definition in the package spec, but replaces the comp_data procedure.
create or replace package body process_data
is
procedure comp_data (inpt_data in p_data_tab)
is
-- define local array to hold status and reason for each.
type status_reason_r is record
( eid employee_copy.eid%type
, status employee_copy.status%type
, reason employee_copy.reason%type
);
type status_reason_t is
table of status_reason_r
index by pls_integer;
status_reason status_reason_t; -- associative array: no constructor call needed
-- define error array to contain the eid for each that have a mismatched column
type error_eids_t is table of employee_copy.eid%type ;
error_eids error_eids_t;
current_matched_indx pls_integer;
/*
Helper function to identify the 1st mismatched column in an error row.
Here is where we slog our way through each column to find the first column
value mismatch. Note: nothing here validates the column sequence; for our
purposes we proceed in the order of the input data type definition.
*/
function identify_mismatch_column(matched_indx_in pls_integer)
return varchar2
is
employee_copy_row employee_copy%rowtype;
mismatched_column employee_copy.reason%type;
begin
select *
into employee_copy_row
from employee_copy
where employee_copy.eid = inpt_data(matched_indx_in).p_eid;
-- now begins the task of finding the mismatched column.
if employee_copy_row.ename != inpt_data(matched_indx_in).p_ename
then
mismatched_column := 'employee_copy.ename';
elsif employee_copy_row.salary != inpt_data(matched_indx_in).p_salary
then
mismatched_column := 'employee_copy.salary';
elsif employee_copy_row.dept != inpt_data(matched_indx_in).p_dept
then
mismatched_column := 'employee_copy.dept';
-- elsif continue until ALL columns tested
end if;
return mismatched_column;
exception
-- NO_DATA_FOUND is the one error that cannot actually be reported in the employee_copy table.
-- It occurs when an eid exists in the input data but does not exist in employee_copy.
when NO_DATA_FOUND
then
dbms_output.put_line( 'Employee (eid)='
|| inpt_data(matched_indx_in).p_eid
|| ' does not exist in employee_copy table.'
);
return 'employee_copy.eid ID is NOT in table';
end identify_mismatch_column;
/*
Helper function to find specified eid in the initial inpt_data array
Since the resulting array of mismatching eid derive from a select without sort
there is no guarantee the index values actually match. Nor can we sort to build
the error array, as there is no way to know the order of eid in the initial array.
The following helper identifies the index value in the input array for the specified
eid in error.
*/
function match_indx(eid_in employee_copy.eid%type)
return pls_integer
is
l_at pls_integer := 1;
l_searching boolean := true;
begin
while l_at <= inpt_data.count
loop
exit when eid_in = inpt_data(l_at).p_eid;
l_at := l_at + 1;
end loop;
if l_at > inpt_data.count
then
raise_application_error( -20199, 'Internal error: Find index for ' || eid_in ||' not found');
end if;
return l_at;
end match_indx;
-- Main
begin
-- initialize status table for each input entry
-- additionally this results in a status_reason table that is 1:1 with the input array.
for i in 1..inpt_data.count
loop
status_reason(i).eid := inpt_data(i).p_eid;
status_reason(i).status :='SUCCESS';
end loop;
/*
We can assume the majority of data in the input array is valid, meaning the columns match.
We'll eliminate all valid rows by selecting each and then MINUSing those that do match on
each column. To accomplish this, cast the input with the TABLE function, allowing its use in SQL.
The following produces an array of eids that have at least 1 column mismatch.
*/
select p_eid
bulk collect into error_eids
from (select p_eid, p_ename, p_salary, p_dept from TABLE(inpt_data)
minus
select eid, ename, salary, dept from employee_copy
) exs;
/*
The error_eids array now contains the eid for each mismatched data item.
Mark the status as failed, then begin the long hard process of identifying
the first column causing the mismatch.
The following loop uses the nested functions to slog the way through.
This keeps the main line logic clear.
*/
for i in 1 .. error_eids.count -- if all inpt_data rows match then count is 0, we bypass the entire loop
loop
current_matched_indx := match_indx(error_eids(i));
status_reason(current_matched_indx).status := 'FAIL';
status_reason(current_matched_indx).reason := identify_mismatch_column(current_matched_indx);
end loop;
-- update employee_copy with appropriate status for each row in the input data.
-- Except for any eid that is in the error eid table but doesn't exist in the employee_copy table.
forall i in inpt_data.first .. inpt_data.last
update employee_copy
set status = status_reason(i).status
, reason = status_reason(i).reason
where eid = inpt_data(i).p_eid;
end comp_data;
end process_data;
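One caveat with the identify_mismatch_column helper above: `!=` evaluates to NULL (not TRUE) when either side is NULL, whereas MINUS treats two NULLs as equal. A row that MINUS flags because exactly one side is NULL can therefore fall through every `elsif` and come back with an empty reason. A minimal sketch of a NULL-aware test, assuming a hypothetical local function named `differs` (not part of the answer's code); it is shown for VARCHAR2 only, and number or date columns would need their own overloads:

```sql
declare
  -- Hypothetical helper: TRUE when the two values differ,
  -- treating NULL as an ordinary comparable value (as MINUS does).
  function differs(a in varchar2, b in varchar2) return boolean
  is
  begin
    if a is null and b is null then
      return false;              -- both missing: considered equal
    elsif a is null or b is null then
      return true;               -- exactly one missing: a mismatch
    else
      return a != b;             -- both present: plain comparison
    end if;
  end differs;
begin
  -- Three illustrative calls covering the equal, one-NULL and both-NULL cases.
  dbms_output.put_line(case when differs('A', 'A')   then 'differs' else 'same' end);
  dbms_output.put_line(case when differs(null, 'A')  then 'differs' else 'same' end);
  dbms_output.put_line(case when differs(null, null) then 'differs' else 'same' end);
end;
/
```

Inside the helper, each test would then read `if differs(employee_copy_row.ename, inpt_data(matched_indx_in).p_ename) then ...` instead of using `!=` directly.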
There are a couple other techniques used you may want to look into if you are not familiar with them:
Nested Functions. There are 2 functions defined and used in the procedure.
Bulk Processing. That is Bulk Collect and Forall.
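For readers new to bulk processing, the two statements can be sketched in isolation as follows. This is only an illustration and assumes a hypothetical table `demo_emp(eid number, status varchar2(10))` that is not part of the question:

```sql
declare
  -- Nested table holding the keys fetched in a single round trip.
  type eid_t is table of demo_emp.eid%type;
  l_eids eid_t;
begin
  -- BULK COLLECT: fetch all matching keys into the collection at once.
  select eid
    bulk collect into l_eids
    from demo_emp
   where status is null;

  -- FORALL: send one UPDATE per collection element in a single batch.
  -- When the collection is empty the range 1 .. 0 executes nothing.
  forall i in 1 .. l_eids.count
    update demo_emp
       set status = 'SUCCESS'
     where eid = l_eids(i);
end;
/
```

Note that FORALL is not a loop: it accepts exactly one DML statement, which is why the per-row work in the procedure above is done in ordinary loops first and only the final UPDATE is batched.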
Good Luck.
ORIGINAL Answer
It is NOT necessary to compare each column nor build a string by concatenating. As you indicated comparing 50 columns becomes pretty heavy. So let the DBMS do most of the lifting. Using the MINUS operator does exactly what you need.
... the MINUS operator, which returns only unique rows returned by the
first query but not by the second.
Using that this task needs only 2 Updates: 1 to mark "fail", and 1 to mark "success". So try:
create table e( e_id integer
, col1 varchar2(20)
, col2 varchar2(20)
);
create table stage ( e_id integer
, col1 varchar2(20)
, col2 varchar2(20)
, status varchar2(20)
, reason varchar2(20)
);
-- create package spec and body
create or replace package process_data
is
procedure comp_data;
end process_data;
/
create or replace package body process_data
is
procedure comp_data
is
begin
-- mark every stage row that has no exact counterpart in e
update stage
set status='failed'
, reason='No matching e row'
where e_id in ( select e_id
from (select e_id, col1, col2 from stage
minus
select e_id, col1, col2 from e
) exs
);
-- whatever is left matched on every column
update stage
set status='success'
where status is null;
end comp_data;
end process_data;
/
-- test
-- populate tables
insert into e(e_id, col1, col2)
select 1,'ABC','def' from dual union all
select 2,'No','Not any' from dual union all
select 3,'ok', 'best ever' from dual union all
select 4,'xx','zzzzzz' from dual;
insert into stage(e_id, col1, col2)
select 1,'ABC','def' from dual union all
select 2,'No','Not any more' from dual union all
select 4,'yy', 'zzzzzz' from dual union all
select 5,'no e','nnnnn' from dual;
-- run procedure
begin
process_data.comp_data;
end;
/
-- check results
select * from stage;
Don't ask. Yes, you must list every column you wish to compare in each of the queries involved in the MINUS operation.
I know the documentation link is old (10gR2), but actually finding Oracle documentation is a royal pain. The MINUS operator still functions the same in 19c.

Related

Record type, Collection and Bulk collect at the same time in oracle plsql

I am facing a challenge in implementing a scenario in code.
I am trying to use record type, collections and bulk collect at the same time during a proof of concept. But I am unable to and I am getting errors.
I don't know how to pass the bulk collect argument as an input parameter to the proc which I had created in the package below...
CREATE OR REPLACE PACKAGE poc1
AS
TYPE poc_rectype IS RECORD
(
id VARCHAR2 (20),
name VARCHAR2 (20)
);
PROCEDURE poc1_prc (poc_rec1 IN poc_rectype);
END poc1;
CREATE OR REPLACE PACKAGE BODY poc1
AS
PROCEDURE poc1_prc (poc_rec1 IN poc_rectype)
IS
BEGIN
FOR i IN 1 .. poc_rec1.COUNT
LOOP
DBMS_OUTPUT.PUT_LINE ('poc_rec1' || poc_rec1.COUNT);
END LOOP;
-- i want to print the records passed from the execution script here
-- later i want to do some insertion in some table..
DBMS_OUTPUT.PUT_LINE ('executed');
END poc1_prc;
END poc1;
Here I am trying to pass only one record for now..
But, I wish to pass a collection of records and print it out or do some insertion in the package containing the procedure above.
/* execution script for the above package*/
DECLARE
l_rec_type poc1.poc_rectype;
BEGIN
SELECT (SELECT 100, 'Jack' FROM DUAL)
BULK COLLECT INTO l_rec_type
FROM DUAL;
poc1.poc1_prc (l_rec_type);
END;
Please could someone help me on implementing this POC.
I tried everything. but i am feeling helpless
You were close, but you were missing a nested table to hold the values. You had a record type and a record variable. But a record variable can only hold a single row of data. To hold multiple rows of data, you need a record type, a nested table, and a nested table variable.
Here's the package to contain the types and process the data:
CREATE OR REPLACE PACKAGE poc1
AS
TYPE poc_rectype IS RECORD
(
id VARCHAR2 (20),
name VARCHAR2 (20)
);
TYPE poc_tab is table of poc_rectype;
PROCEDURE poc1_prc (poc_recs IN poc_tab);
END poc1;
/
CREATE OR REPLACE PACKAGE BODY poc1
AS
PROCEDURE poc1_prc (poc_recs IN poc_tab)
IS
BEGIN
FOR i IN 1 .. poc_recs.COUNT
LOOP
DBMS_OUTPUT.PUT_LINE ('poc_recs.id: ' || poc_recs(i).id);
END LOOP;
END poc1_prc;
END poc1;
/
Here's an anonymous block that populates the nested table variable and passes it to the procedure for processing:
DECLARE
l_pocs poc1.poc_tab;
BEGIN
SELECT id, name
BULK COLLECT INTO l_pocs
FROM
(
SELECT 100 id, 'Jack' name FROM DUAL UNION ALL
SELECT 101 id, 'Jill' name FROM DUAL
);
poc1.poc1_prc(l_pocs);
END;
/
Output:
-------
poc_recs.id: 100
poc_recs.id: 101
Since you tagged the question with 10g, you might need to add an extra step, and create the record type and nested table as separate variables. Older versions of Oracle couldn't always convert from SQL to PL/SQL types.

Inserting into a table using a procedure only if the record doesn't exist yet

I have a table that i'm trying to populate via a plsql script (runs on plsql developer). The actual DML statement
is contained in a procedure inside a package. The procedure only inserts if the record doesn't exist yet.
It doesn't work. The existence check returns true after the first iteration of the script loop, even if the record doesn't actually exist in the table.
If I put the commit outside the loop, nothing gets inserted at all, and the existence check returns true for every iteration even if the table is empty.
When i try to simplify the insert with existence check to be in just one statement without the exception handling, i get the same outcome.
Please tell me what I'm doing wrong here.
CREATE OR REPLACE PACKAGE BODY some_package
IS
PROCEDURE add_to_queue(id IN NUMBER)
IS
pending_record VARCHAR2(1);
BEGIN
-- this part succeeds even if nothing matches the criteria
-- during the loop in the outside script
SELECT 'Y'
INTO pending_record
FROM dual
WHERE EXISTS (SELECT 'x' FROM some_queue smq
WHERE smq.id = id AND smq.status IS NULL);
EXCEPTION
WHEN NO_DATA_FOUND THEN
INSERT INTO some_queue (seqno, id, activity_date)
VALUES (some_sequence.nextval, id, SYSDATE);
WHEN OTHERS THEN
NULL;
END;
END some_package;
CREATE TABLE some_queue
(
seqno VARCHAR2(500) NOT NULL,
id NUMBER NOT NULL,
activity_date DATE NOT NULL,
status VARCHAR2(25),
CONSTRAINT some_queue_pk PRIMARY KEY (seqno)
);
-- script to randomly fill in the table with ids from another table
declare
type ids_coll_tt is table of number index by pls_integer;
ids_coll_table ids_coll_tt;
cursor ids_coll_cur is
select tab.id
from (select *
from ids_source_table
order by dbms_random.value ) tab
where rownum < 10;
begin
open ids_coll_cur;
fetch ids_coll_cur bulk collect into ids_coll_table;
close ids_coll_cur;
for x in 1..ids_coll_table.count
loop
some_package.add_to_queue(ids_coll_table(x));
commit; -- if this is here, the first iteration gets inserted
end loop;
-- commit; -- if the commit is done here, nothing gets inserted
end;
Note: I translated this code to be more generic for posting. Forgive me if there are any typos.
Update: even if i put everything inside the script and not use the package, i'm not able to properly check for existence and I get the same results.
I figured out the solution:
CREATE OR REPLACE PACKAGE BODY some_package
IS
PROCEDURE add_to_queue(p_id IN NUMBER)
IS
pending_record VARCHAR2(1);
BEGIN
-- p_id no longer shares its name with the column, so this check
-- now compares smq.id with the parameter value
SELECT 'Y'
INTO pending_record
FROM dual
WHERE EXISTS (SELECT 'x' FROM some_queue smq
WHERE smq.id = p_id AND smq.status IS NULL);
EXCEPTION
WHEN NO_DATA_FOUND THEN
INSERT INTO some_queue (seqno, id, activity_date)
VALUES (some_sequence.nextval, p_id, SYSDATE);
WHEN OTHERS THEN
NULL;
END;
END some_package;
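Changing the parameter name fixed it. Inside a SQL statement, a column name takes precedence over a PL/SQL parameter of the same name, so the original smq.id = id compared the column with itself and was true for every row with a non-NULL id.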
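If renaming the parameter is not an option, qualifying it with the subprogram name also removes the ambiguity. A minimal sketch, using a hypothetical standalone procedure `add_to_queue_demo` (an assumption for illustration) against the same some_queue table:

```sql
CREATE OR REPLACE PROCEDURE add_to_queue_demo (id IN NUMBER)
IS
   l_cnt PLS_INTEGER;
BEGIN
   -- An unqualified "id" here would resolve to the column some_queue.id:
   --    WHERE smq.id = id        -- effectively smq.id = smq.id
   -- Prefixing with the procedure name makes it the parameter instead.
   SELECT COUNT(*)
     INTO l_cnt
     FROM some_queue smq
    WHERE smq.id = add_to_queue_demo.id
      AND smq.status IS NULL;

   DBMS_OUTPUT.PUT_LINE('pending rows for this id: ' || l_cnt);
END add_to_queue_demo;
/
```

A naming prefix such as p_ is usually the simpler habit, since it avoids the collision in every statement without extra qualification.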
Don't name the parameter the same as the column (use a prefix like p_ or in_), and you can do it in a single statement if you use a MERGE statement. Note that the source has to be a one-row query built from the parameter rather than a self-join on ROWID: WHEN NOT MATCHED fires only for source rows, so a source selected from some_queue itself would never insert anything.
CREATE OR REPLACE PACKAGE BODY some_package
IS
PROCEDURE add_to_queue(
in_id IN NUMBER
)
IS
BEGIN
MERGE INTO some_queue dst
USING ( SELECT in_id AS id
FROM dual ) src
ON ( dst.id = src.id
AND dst.status IS NULL )
WHEN NOT MATCHED THEN
INSERT (seqno, id, activity_date)
VALUES (some_sequence.nextval, src.id, SYSDATE);
END;
END some_package;

While INSERT got error PLS-00904: stud.col3 is invalid identifier

In my stored procedure, if the values of col1 and col2 match the employee table, I want to insert the unique employee record. If no match is found, I then want to match col1, col2 and col3 against employee and insert the value on a match. If there is still no match on all these columns, I want to insert the record using another column.
I also want to look up a list of values such as emp_id by passing another column's value; if not even a single record matches, emp_id should be set to NULL.
Finally, I want to insert one record at a time, after matching txt against the other tables that hold data, such as emp.
create or replace procedure sp_ex
as
cursor c1 is select * from txt%rowtype;
v_col1 tbl1.col1%type;
type record is table of txt%rowtype; --Staging table
v_rc record := record();
begin
open c1;
loop
fetch c1 bulk collect into v_rc limit 1000;
loop
for i in 1..v_rc.count loop
select col1 into v_col1 from tbl1
where exists (select col1 from tbl1 where tbl1.col1 = emp.col1);
insert
when txt.col1 = emp.col1 and txt.col2 = stud.col2 then
into main_table(columns) values(v_rc(i).col1, ...)
when txt.col1 = emp.col1 and txt.col2 = stud.col2 and txt.col3 = stud.col3 then
into main_table(columns) values(v_rc(i).col1, ...)
else
insert into main_table(columns) values(v_rc(i).col1, ...)
select * from txt;
end loop;
exit when v_rc.count < limit;
end loop;
close c1;
end sp_ex;
emp and stud are the other tables that I have to match against txt.
In this stored procedure I want to load data from txt into main_table in batch-processing mode. Each record should be matched one at a time, and loaded into the main table only if a matching condition holds. How can I write the stored procedure so that the data is loaded record by record with the logic above, in batches? Could you please share your ideas? Thanks
The syntax seems to be rather mixed up.
Multi-table insert is like this:
insert all -- alternatively, "insert first"
when dummy = 'X' then
into demo (id) values (1)
when dummy = 'Y' then
into demo (id) values (2)
else
into demo (id) values (3)
select * from dual;
Or perhaps you wanted a PL/SQL case statement:
case
when dummy = 'X' then
insert into demo (id) values (1);
when dummy = 'Y' then
insert into demo (id) values (2);
else
insert into demo (id) values (3);
end case;
Instead there seems to be a mixture of the two.
Also there is a missing end loop, and an implicit cursor (select col1 from tbl1) with no into clause.
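For the batch-fetch part, the usual shape of a BULK COLLECT ... LIMIT loop can be sketched as below. This is only a skeleton under assumptions: it presumes main_table has a col1 column, and the matching rules against emp and stud are left as a placeholder comment because the question does not pin them down:

```sql
declare
  cursor c1 is
    select * from txt;                       -- a cursor selects from the table, not from txt%rowtype
  type txt_tab is table of txt%rowtype;      -- avoid naming a type "record"
  v_rc    txt_tab;
  c_limit constant pls_integer := 1000;      -- batch size
begin
  open c1;
  loop
    fetch c1 bulk collect into v_rc limit c_limit;
    exit when v_rc.count = 0;                -- nothing left to process

    for i in 1 .. v_rc.count
    loop
      -- placeholder: apply the matching rules against emp / stud here
      insert into main_table (col1)
      values (v_rc(i).col1);
    end loop;
  end loop;
  close c1;
end;
/
```

Exiting on `v_rc.count = 0` rather than on `c1%notfound` immediately after the fetch keeps the final, partially filled batch from being skipped.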

Code not working. It takes an eternity to run

The code below runs for an eternity.
As you can see i have to take values from one table and use that value to check if the second table contains it or not and insert into the third table values from the first table.
Is there any other way of doing this?
create or replace PROCEDURE KPI_AVAILABILITY (
v_programid varchar2
)
AS
v_MASTER_KPI_ID number;
v_UDF varchar2(100);
v_count number;
cursor c1 is
(select MASTER_KPI_ID,UDF from KPI_MASTER
where UDF is not null
and ISACTIVE = 1
--order by MASTER_KPI_ID,udf
);
BEGIN
open c1 ;
fetch c1 into v_MASTER_KPI_ID,v_UDF;
while v_UDF is not null
loop
select count(v_UDF) into v_count
from vw_ticket
where v_UDF is not null
and amsprogramid = v_programid;
if v_count is not null or v_count <> 0 then
delete from program_kpi where amsprogramid = v_programid;
INSERT INTO PROGRAM_KPI (AMSPROGRAMID,MASTER_KPI_ID,LASTUPDATEDBYDATALOAD)
VALUES(V_PROGRAMID,v_MASTER_KPI_ID,to_char(sysdate,'dd-mon-yy hh.mi.ss'));
dbms_output.put_line('xyz');
end if;
end loop;
close c1;
END KPI_AVAILABILITY;
Reverse engineering business rules from another developer's code is always tricky, especially without understanding the wider domain. However, at the centre of the loop is DELETE from program_kpi followed by an INSERT into the same table. If there are no records matching on amsprogramid = v_programid then you're inserting a record, if there are matches then effectively you're just updating lastupdatedbydataload with the current SYSDATE.
In other words, it appears to be the logic of a MERGE. So perhaps your code could be entirely replaced with a single statement. If so, this is likely to be a lot more efficient than the row-by-agonizing-row process within a cursor loop.
merge into program_kpi pkpi
using (select kpim.master_kpi_id
, kpim.udf
, v_programid as amsprogramid -- the parameter needs a column alias
from kpi_master kpim
where kpim.udf is not null
and kpim.isactive = 1
and exists ( select null
from vw_ticket tkt
where tkt.amsprogramid = v_programid)
) kpim
on (kpim.amsprogramid = pkpi.amsprogramid
and kpim.master_kpi_id = pkpi.master_kpi_id)
when not matched then
insert (amsprogramid, master_kpi_id, lastupdatedbydataload)
values (kpim.amsprogramid, kpim.master_kpi_id, sysdate)
when matched then
update
set pkpi.lastupdatedbydataload = sysdate;
Please check the results of this code with your expected outcome. As I said, reverse-engineering business logic is hard, and matching on master_kpi_id as well as programid is not the same as just deleting on programid.
You never fetch from c1 again after the first fetch, so v_UDF keeps its first value. The WHILE condition stays true and the loop repeats the same comparison forever.
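If the row-by-row cursor has to stay, the fetch belongs inside the loop with an exit test on the cursor. A minimal sketch of that pattern, reusing the cursor from the question; the loop body here only prints each row and stands in for the real work:

```sql
declare
  cursor c1 is
    select master_kpi_id, udf
      from kpi_master
     where udf is not null
       and isactive = 1;
  v_master_kpi_id kpi_master.master_kpi_id%type;
  v_udf           kpi_master.udf%type;
begin
  open c1;
  loop
    fetch c1 into v_master_kpi_id, v_udf;   -- advance the cursor on every pass
    exit when c1%notfound;                  -- stop once the rows are exhausted
    dbms_output.put_line(v_master_kpi_id || ': ' || v_udf);
  end loop;
  close c1;
end;
/
```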

How to return a Cursor for pl/sql table

I select data from several tables. Then I need to edit the data returned from the cursor before returning it. The cursor will then be passed to a Perl script to display the rows.
To do that I build a PL/SQL table as in the following code. What I need to know is how to return a cursor over that table.
At present I get the error "table or view doesn't exist". The test code I use for a simple table is attached here.
CREATE OR REPLACE FUNCTION test_rep
RETURN SYS_REFCURSOR
AS
CURSOR rec_Cur IS
SELECT table1.NAME,
table1.ID
FROM TESTREPORT table1;
TYPE rec_Table IS TABLE OF rec_Cur%ROWTYPE INDEX BY PLS_INTEGER;
working_Rec_Table rec_Table;
TYPE n_trade_rec IS RECORD
(
NAME VARCHAR2(15),
ID NUMBER
);
TYPE ga_novated_trades IS TABLE OF n_trade_rec index by VARCHAR2(15);
va_novated_trades ga_novated_trades;
v_unique_key VARCHAR2(15);
TYPE db_cursor IS REF CURSOR;
db_cursor2 db_cursor;
BEGIN
OPEN rec_Cur;
FETCH rec_Cur BULK COLLECT INTO working_Rec_Table;
FOR I IN 1..working_Rec_Table.COUNT LOOP
v_unique_key := working_Rec_Table(I).NAME;
va_novated_trades(v_unique_key).NAME := working_Rec_Table(I).NAME;
va_novated_trades(v_unique_key).ID := working_Rec_Table(I).ID;
END LOOP; --FOR LOOP
OPEN db_cursor2 FOR SELECT * FROM va_novated_trades; --ERROR LINE
CLOSE rec_Cur;
RETURN db_cursor2;
END test_rep;
/
Basically, there is a way to select from a table type in Oracle, using the TABLE() function:
SELECT * FROM table(va_novated_trades);
But this works only for schema table types and not for PL/SQL tables (that is, the table types must be defined in the SCHEMA and not in a PL/SQL package):
CREATE TYPE n_trade_rec AS OBJECT
(
NAME VARCHAR2(15),
ID NUMBER
);
CREATE TYPE ga_novated_trades AS TABLE OF n_trade_rec;
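With those two schema-level types in place, the function can be sketched roughly as follows. This is an illustration rather than a drop-in replacement: the UPPER call is only a placeholder for the real editing step, the variable names are made up, and it assumes TESTREPORT.NAME fits in VARCHAR2(15):

```sql
CREATE OR REPLACE FUNCTION test_rep
   RETURN SYS_REFCURSOR
AS
   v_trades   ga_novated_trades;   -- schema-level nested table type
   v_result   SYS_REFCURSOR;
BEGIN
   -- Build one object per row and collect them all in one fetch.
   SELECT n_trade_rec(t.name, t.id)
     BULK COLLECT INTO v_trades
     FROM testreport t;

   -- Edit the rows in memory before handing them back.
   FOR i IN 1 .. v_trades.COUNT
   LOOP
      v_trades(i).name := UPPER(v_trades(i).name);   -- placeholder edit
   END LOOP;

   -- Expose the edited collection to the caller as a cursor.
   OPEN v_result FOR
      SELECT name, id
        FROM TABLE(CAST(v_trades AS ga_novated_trades));

   RETURN v_result;
END test_rep;
/
```

The explicit CAST is redundant on recent releases but harmless, and it keeps the statement valid on older ones.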
But I still think you should try to do it all in a query (and/or in the perl script),
For example, there is one field where i have to analyse the 4th
character and then edit other fields accordingly
This can be achieved in the query, could be something like:
select case when substr(one_field, 4, 1) = 'A' then 'A.' || sec_field
when substr(one_field, 4, 1) = 'B' then 'B.' || sec_field
else sec_field
end as new_sec_field,
case when substr(one_field, 4, 1) = 'A' then 100 * trd_field
when substr(one_field, 4, 1) = 'B' then 1000 * trd_field
else trd_field
end as new_trd_field,
-- and so on
from TESTREPORT

Resources