I have a script like the one below that takes approximately 2 hours to update only 23% of the records in a table. The table has approximately 170K records and 150 columns.
DECLARE
V_EMP_SAL NUMBER;
BEGIN
For EmpIDs in (SELECT EmpID From Employees Where Emp_Address is null and Emp_phone is null and Emp_active='Y')
LOOP
SELECT SUM(EMP_SAL) into V_EMP_SAL FROM Employees
where Emp_ID=EmpIDs.EmpID and Emp_Active='Y';
UPDATE Employees
Set EMP_SAL_HIKE = SAL- V_EMP_SAL
WHERE EMP_ID=EmpIDs.EmpID;
COMMIT;
END LOOP;
EXCEPTION
WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('ERROR OCCURED');
END;
Rewriting this as a single update, like the one below, takes a similar amount of time.
UPDATE Employees E1
Set E1.EMP_SAL_HIKE = E1.SAL- (SELECT SUM(EMP_SAL) FROM Employees
where Emp_ID=E1.EmpID and Emp_Active='Y')
WHERE E1.Emp_Address is null
AND E1.Emp_phone is null
AND E1.Emp_active='Y';
My questions are:
Why is it taking this much time?
How do I find out which part takes the most time?
How can I optimize this so that it executes quickly?
Thanks in advance.
Row-by-row (in a loop) is slow-by-slow.
Your best option is to skip PL/SQL and loop completely and switch to a single update (or merge) statement, e.g.
merge into employees a
using (select emp_id,
sum(emp_sal) v_emp_sal
from employees
where emp_address is null
and emp_phone is null
and emp_active = 'Y'
group by emp_id
) x
on (a.emp_id = x.emp_id)
when matched then update set
a.emp_sal_hike = a.sal - x.v_emp_sal;
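To see which part of a statement takes the time, one option is to look at its execution plan. A minimal sketch, assuming the MERGE above and that the standard PLAN_TABLE is available to your user:

```sql
-- Ask the optimizer for the plan without actually running the statement.
EXPLAIN PLAN FOR
merge into employees a
using (select emp_id,
              sum(emp_sal) v_emp_sal
         from employees
        where emp_address is null
          and emp_phone is null
          and emp_active = 'Y'
        group by emp_id
      ) x
on (a.emp_id = x.emp_id)
when matched then update set
  a.emp_sal_hike = a.sal - x.v_emp_sal;

-- Display the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

This shows estimated costs only. For actual row counts and timings per step, a common next step is to run the statement with the GATHER_PLAN_STATISTICS hint and then query DBMS_XPLAN.DISPLAY_CURSOR with the 'ALLSTATS LAST' format, assuming you have the privileges on the underlying V$ views.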
I have created a stored procedure that takes too long to update the columns of a table: about 3 hours to update 2.5k records out of 43k.
Can I reduce the time it takes to update the records? My logic is below.
procedure UPDATE_MST_INFO_BKC
(
P_SAPID IN NVARCHAR2
)
as
v_cityname varchar2(500):='';
v_neid varchar2(500):='';
v_latitude varchar2(500):='';
v_longitude varchar2(500):='';
v_structuretype varchar2(500):='';
v_jc_name varchar2(500):='';
v_jc_code varchar2(500):='';
v_company_code varchar2(500):='';
v_cnt number :=0;
begin
select count(*) into v_cnt from structure_enodeb_mapping where RJ_SAPID=P_SAPID and rownum=1;
if v_cnt > 0 then
begin
select RJ_CITY_NAME, RJ_NETWORK_ENTITY_ID,LATITUDE,LONGITUDE,RJ_STRUCTURE_TYPE,RJ_JC_NAME,RJ_JC_CODE,'6000'
into v_cityname,v_neid,v_latitude, v_longitude, v_structuretype,v_jc_name,v_jc_code,v_company_code from structure_enodeb_mapping where RJ_SAPID=P_SAPID and rownum=1;
update tbl_ipcolo_mast_info set
CITY_NAME = v_cityname,
NEID = v_neid,
FACILITY_LATITUDE = v_latitude,
FACILITY_LONGITUDE = v_longitude,
RJ_STRUCTURE_TYPE = v_structuretype,
RJ_JC_NAME = v_jc_name,
RJ_JC_CODE = v_jc_code,
COMPANY_CODE = v_company_code
where SAP_ID=P_SAPID;
end;
end if;
end UPDATE_MST_INFO_BKC;
What adjustments can I make to this?
As far as I understand your code, it updates the TBL_IPCOLO_MAST_INFO rows having SAP_ID = P_SAPID. That means it updates one record per call, and you must be calling the procedure once for each record.
It is good practice to call the procedure once and update all the records in one go. (In your case, all 2.5k records should be updated by a single call of this procedure.)
For now, I have reduced the procedure body to a single MERGE statement, which does the same as the multiple SQL statements in your question for a single P_SAPID.
PROCEDURE UPDATE_MST_INFO_BKC (
P_SAPID IN NVARCHAR2
) AS
BEGIN
MERGE INTO TBL_IPCOLO_MAST_INFO I
USING (
SELECT
RJ_CITY_NAME,
RJ_NETWORK_ENTITY_ID,
LATITUDE,
LONGITUDE,
RJ_STRUCTURE_TYPE,
RJ_JC_NAME,
RJ_JC_CODE,
'6000' AS COMPANY_CODE,
RJ_SAPID
FROM
STRUCTURE_ENODEB_MAPPING
WHERE
RJ_SAPID = P_SAPID
AND ROWNUM = 1
)
O ON ( I.SAP_ID = O.RJ_SAPID )
WHEN MATCHED THEN
UPDATE SET I.CITY_NAME = O.RJ_CITY_NAME,
I.NEID = O.RJ_NETWORK_ENTITY_ID,
I.FACILITY_LATITUDE = O.LATITUDE,
I.FACILITY_LONGITUDE = O.LONGITUDE,
I.RJ_STRUCTURE_TYPE = O.RJ_STRUCTURE_TYPE,
I.RJ_JC_NAME = O.RJ_JC_NAME,
I.RJ_JC_CODE = O.RJ_JC_CODE,
I.COMPANY_CODE = O.COMPANY_CODE;
END UPDATE_MST_INFO_BKC;
Cheers!!
3 hours? That's way too much. Are sap_id columns indexed? Even if they aren't, data set of 43K rows is just too small.
How do you call that procedure? Is it part of another code, perhaps some unfortunate loop which does something row-by-row (which is, in turn, slow-by-slow)?
A few objections:
are all those variables' datatypes really varchar2(500)? Consider declaring them so that they'd take table column's datatype, e.g. v_cityname structure_enodeb_mapping.rj_city_name%type;. Also, there's no need to explicitly say that their value is null (:= ''), it is so by default
select statement which checks whether there's something in the table for that parameter's value should be rewritten to use EXISTS as it should perform better than rownum = 1 condition you used.
also, consider using exception handlers (no-data-found if there's no row for a certain ID; too-many-rows if there are two or more rows)
select statement that collects data into variables has the same condition; do you really expect more than a single row for each ID (passed as a parameter)?
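A minimal sketch of the %type and exception-handler suggestions above. It assumes at most one structure_enodeb_mapping row is expected per RJ_SAPID, is written as a standalone procedure (the original may live in a package), and shows only one of the eight columns for brevity; the others would follow the same pattern.

```sql
create or replace procedure update_mst_info_bkc (p_sapid in nvarchar2)
as
  -- Anchor the variable to the column's datatype instead of varchar2(500).
  v_cityname structure_enodeb_mapping.rj_city_name%type;
begin
  -- No separate count(*) check: a missing row raises NO_DATA_FOUND.
  select rj_city_name
    into v_cityname
    from structure_enodeb_mapping
   where rj_sapid = p_sapid;

  update tbl_ipcolo_mast_info
     set city_name = v_cityname
   where sap_id = p_sapid;
exception
  when no_data_found then
    null;   -- nothing to copy for this ID, so the target row is left untouched
  when too_many_rows then
    raise;  -- more than one source row; treated here as a data error (assumption)
end update_mst_info_bkc;
/
```

With the handlers in place, the initial count(*) query and the rownum = 1 conditions are no longer needed.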
Anyway, the whole procedure's code can be shortened to a single update statement:
update tbl_ipcolo_mast_info t set
(t.city_name, t.neid, ...) = (select s.rj_city_name,
s.rj_network_entity_id, ...
from structure_enodeb_mapping s
where s.rj_sapid = t.sap_id
)
where t.sap_id = p_sapid;
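If there is something to be updated, it will be. If there's no matching t.sap_id, nothing will happen. Note, however, that if a row in the target table matches p_sapid but structure_enodeb_mapping has no row for that ID, the listed columns are set to NULL; add an EXISTS condition on structure_enodeb_mapping to the WHERE clause if such rows should be left untouched. The subquery must also return at most one row per ID, otherwise the statement fails with ORA-01427.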
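A sketch of that single statement with the column list written out from the question, plus an EXISTS guard so that a missing source row does not overwrite the target columns with NULL. It assumes structure_enodeb_mapping holds at most one row per rj_sapid and is meant to sit inside the procedure, where p_sapid is the parameter.

```sql
update tbl_ipcolo_mast_info t
   set (t.city_name, t.neid, t.facility_latitude, t.facility_longitude,
        t.rj_structure_type, t.rj_jc_name, t.rj_jc_code, t.company_code) =
       (select s.rj_city_name, s.rj_network_entity_id, s.latitude, s.longitude,
               s.rj_structure_type, s.rj_jc_name, s.rj_jc_code, '6000'
          from structure_enodeb_mapping s
         where s.rj_sapid = t.sap_id)
 where t.sap_id = p_sapid
   -- Only touch the row when a source row actually exists.
   and exists (select null
                 from structure_enodeb_mapping s
                where s.rj_sapid = t.sap_id);
```

Dropping the t.sap_id = p_sapid condition would update every matching row in one pass, which is usually what you want when the procedure is otherwise called once per ID.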
My statement:
SELECT ROW_ID FROM DATA_T WHERE CITY_ID=2000 AND IS_FREE=0 AND ROWNUM = 1
is used to retrieve the first row for a db table that has many entries with CITY_ID equal to 2000.
The ROW_ID that is returned is then used in an UPDATE statement in order to use this row and set IS_FREE=1.
That worked very well until two threads called the SELECT statement and obviously got the same ROW_ID... That is my problem in a few words.
I am using ORACLE DB (12.x)
How do I resolve the problem? Can I use FOR UPDATE in this case?
I want every "client" to somehow get a different row, or at least lock one of them.
Something like this
function get_row_id return number
as
cursor cur_upd is
SELECT ROW_ID FROM TB WHERE CITY_ID=2000 AND IS_FREE=0 AND ROWNUM = 1
FOR UPDATE SKIP LOCKED;
begin
for get_cur_upd in cur_upd
loop
update TB
set IS_FREE = 1
where ROW_ID = get_cur_upd.ROW_ID;
commit work;
return get_cur_upd.ROW_ID;
end loop;
return null;
end;
Whether to commit after the update depends on your logic.
You can also return row_id without the update and commit, and do those later outside the function.
MERGE INTO ////////1 GFO
USING
(SELECT *
FROM
(SELECT facto/////rid,
p-Id,
PRE/////EDATE,
RU//MODE,
cre///date,
ROW_NUMBER() OVER (PARTITION BY facto/////id ORDER BY cre///te DESC) col
FROM ///////////2
) x
WHERE x.col = 1) UFD
ON (GFO.FACTO-/////RID=UFD.FACTO////RID)
WHEN MATCHED THEN UPDATE
SET
GFO.PRE////DATE=UFD.PRE//////DATE
WHERE UFD.CRE/////DATE IS NOT NULL
AND UFD.RU//MODE= 'S'
AND GFO.P////ID=:2
Hi everyone, my merge statement above is taking too long. It has to run 40 times on table 1 using table2, each having 4 million plus records, for 40 different p--id values. Please suggest a more efficient way, as it currently takes 40+ minutes.
It updates only one column, using a column from table2.
I am unable to execute the query; it returns
Error: cannot fetch last explain plan from PLAN_TABLE
[Screenshot of the explain plan]
The plan shown seems to be OK; the observed problem stems from the loop over P_ID, which does not scale.
I assume you perform something like this (strongly simplified), assuming the P_ID values to be processed are in table TAB_PID:
begin
for cur in (select p_id from tab_pid) loop
merge INTO tab1 USING tab2 ON (tab1.r_id = tab2.r_id)
WHEN MATCHED THEN
UPDATE SET tab1.col1=tab2.col1 WHERE p_id = cur.p_id;
end loop;
end;
/
A HASH JOIN on large tables (in NO PARALLEL mode) with an elapsed time of 60 seconds is not a catastrophic result. But looping 40 times gives you your 40 minutes.
So I'd suggest trying to integrate the loop into the MERGE statement. Without knowing the details, something like this (maybe you'll also need to adjust the MERGE join condition):
merge INTO tab1 USING tab2 ON (tab1.r_id = tab2.r_id)
WHEN MATCHED THEN
UPDATE SET tab1.col1=tab2.col1
WHERE p_id in (select p_id from tab_pid);
I am receiving information in a CSV file from one department to compare with the same information in a different department and check for discrepancies (about three quarters of a million rows of data with 44 columns in each row). After I have the data in a table, I have a program that will take the data and send reports based on a HQ. I feel like the way I am going about this is not the most efficient. I am using Oracle for this comparison.
Here is what I have:
I have a vb.net program that parses the data and inserts it into an extract table
I run a procedure to do a full outer join on the two tables into a new table, with the fields from one department suffixed with '_c'
I run another procedure to compare the old/new data and update 2 different tables with detail and summary information. Here is code from inside the procedure:
DECLARE
CURSOR Cur_Comp IS SELECT * FROM T.AEC_CIS_COMP;
BEGIN
FOR compRow in Cur_Comp LOOP
--If service pipe exists in CIS but not in FM and the service pipe has status of retired in CIS, ignore the variance
If(compRow.pipe_num = '' AND cis_status_c = 'R')
continue
END IF
--If there is not a summary record for this HQ in the table for this run, create one
INSERT INTO t.AEC_CIS_SUM (HQ, RUN_DATE)
SELECT compRow.HQ, to_date(sysdate, 'DD/MM/YYYY') from dual WHERE NOT EXISTS
(SELECT null FROM t.AEC_CIS_SUM WHERE HQ = compRow.HQ AND RUN_DATE = to_date(sysdate, 'DD/MM/YYYY'))
-- Check fields and update the tables accordingly
If (compRow.cis_loop <> compRow.cis_loop_c) Then
--Insert information into the details table
INSERT INTO T.AEC_CIS_DET( Fac_id, Pipe_Num, Hq, Address, AutoUpdatedFl,
DateTime, Changed_Field, CIS_Value, FM_Value)
VALUES(compRow.Fac_ID, compRow.Pipe_Num, compRow.Hq, compRow.Street_Num || ' ' || compRow.Street_Name,
'Y', sysdate, 'Cis_Loop', compRow.cis_loop, compRow.cis_loop_c);
-- Update information into the summary table
UPDATE AEC_CIS_SUM
SET cis_loop = cis_loop + 1
WHERE Hq = compRow.Hq
AND Run_Date = to_date(sysdate, 'DD/MM/YYYY')
End If;
END LOOP;
END;
Any suggestions of an easier way of doing this rather than an if statement for all 44 columns of the table? (This is run once a week if it matters)
Update: Just to clarify, there are 88 columns of data (44 pairs to compare, with one of each pair suffixed with _c). One table lists each differing field in its own row, so one source row can mean 30+ records written to that table. The other table keeps a tally of the number of discrepancies for each week.
First of all, I believe that your task can (and actually should) be implemented with straight SQL. No fancy cursors, no loops, just selects, inserts and updates. I would start by unpivoting your source data (it is not clear whether you have a primary key to join the two sets; I guess you do):
Col0_PK Col1 Col2 Col3 Col4
----------------------------------------
Row1_val A B C D
Row2_val E F G H
Above is your source data. Using UNPIVOT clause we convert it to:
Col0_PK Col_Name Col_Value
------------------------------
Row1_val Col1 A
Row1_val Col2 B
Row1_val Col3 C
Row1_val Col4 D
Row2_val Col1 E
Row2_val Col2 F
Row2_val Col3 G
Row2_val Col4 H
I think you get the idea. Say we have table1 with one set of data and an identically structured table2 with the second set of data. It is a good idea to use index-organized tables.
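A minimal sketch of the UNPIVOT step for the four-column example above. The table name source_wide is hypothetical (it stands for the wide source table before unpivoting), and the unpivoted columns are assumed to share one datatype, which UNPIVOT requires.

```sql
-- Turn each of Col1..Col4 into its own (Col_Name, Col_Value) row.
-- INCLUDE NULLS keeps rows whose value is NULL so they can still be compared.
select col0_pk, col_name, col_value
  from source_wide
unpivot include nulls (
         col_value for col_name in (col1 as 'Col1',
                                    col2 as 'Col2',
                                    col3 as 'Col3',
                                    col4 as 'Col4')
       );
```

Without INCLUDE NULLS, Oracle drops the rows whose value is NULL, so a field that is empty on only one side would never show up as a difference.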
Next step is comparing rows to each other and storing difference details. Something like:
insert into diff_details(some_service_info_columns_here)
select some_service_info_columns_here_along_with_data_difference
from table1 t1 inner join table2 t2
on t1.Col0_PK = t2.Col0_PK
and t1.Col_name = t2.Col_name
and nvl(t1.Col_value, 'Dummy1') <> nvl(t2.Col_value, 'Dummy2');
And on the last step we update difference summary table:
insert into diff_summary(summary_columns_here)
select diff_row_id, count(*) as diff_count
from diff_details
group by diff_row_id;
This is just a rough draft to show my approach; I'm sure there are many more details that should be taken into account. To summarize, I suggest two things:
UNPIVOT data
Use SQL statements instead of cursors
You have several issues in your code:
If(compRow.pipe_num = '' AND cis_status_c = 'R')
continue
END IF
"cis_status_c" is not declared. Is it a variable or a column in AEC_CIS_COMP?
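If it is a column, just put the condition into the cursor query. The query cannot reference compRow, and in Oracle an empty string is NULL, so a comparison with '' is never true; use IS NULL instead, i.e. SELECT * FROM T.AEC_CIS_COMP WHERE pipe_num IS NOT NULL OR cis_status_c IS NULL OR cis_status_c <> 'R'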
to_date(sysdate, 'DD/MM/YYYY')
That's nonsense, you convert a date into a date, simply use TRUNC(SYSDATE)
Anyway, I think you can use three single statements instead of a cursor:
-- DISTINCT: create at most one summary row per HQ, not one per comparison row
INSERT INTO t.AEC_CIS_SUM (HQ, RUN_DATE)
SELECT DISTINCT comp.HQ, trunc(sysdate)
from AEC_CIS_COMP comp
WHERE NOT EXISTS
(SELECT null FROM t.AEC_CIS_SUM WHERE HQ = comp.HQ AND RUN_DATE = trunc(sysdate));
INSERT INTO T.AEC_CIS_DET( Fac_id, Pipe_Num, Hq, Address, AutoUpdatedFl, DateTime, Changed_Field, CIS_Value, FM_Value)
select comp.Fac_ID, comp.Pipe_Num, comp.Hq, comp.Street_Num || ' ' || comp.Street_Name, 'Y', sysdate, 'Cis_Loop', comp.cis_loop, comp.cis_loop_c
from T.AEC_CIS_COMP comp
where comp.cis_loop <> comp.cis_loop_c;
-- Add the number of differing rows per HQ (the loop incremented once per differing row)
UPDATE AEC_CIS_SUM s
SET s.cis_loop = s.cis_loop + (SELECT count(*) FROM T.AEC_CIS_COMP comp
WHERE comp.Hq = s.Hq AND comp.cis_loop <> comp.cis_loop_c)
WHERE trunc(s.Run_Date) = trunc(sysdate);
They are not tested but they should give you a hint how to do it.
Query below takes 20 seconds to run. user_table has 40054 records. other_table has 14000 records
select count(a.user_id) from user_table a, other_table b
where a.user_id = b.user_id;
Our restriction is that any query running more than 8 seconds gets killed...>_< I've run explain plans and asked questions here, but given our restrictions I cannot get this query to run in less than 8 seconds. So I made a loop out of it.
begin
FOR i IN role_user_rec.FIRST .. role_user_rec.LAST LOOP
SELECT COUNT (a.user_id) INTO v_into FROM user_table a
WHERE TRIM(role_user_rec(i).user_id) = TRIM(a.user_id);
v_count := v_count + v_into;
END LOOP;
I know restrictions suck and this is not an efficient way to do things, but is there any other way to make this loop run faster?
Can you get around the loop? I agree with Janek: if the query itself takes too long, you may have to use a different method to get the result. And I agree with Mark: if you can do it in one query, then by all means do so.
If you cannot, try something like this, which drops the loop:
/*
--set up for demo/test
Create Type Testusertype As Object(User_Id Number , User_Name Varchar2(500));
CREATE TYPE TESTUSERTYPETABLE IS TABLE OF TESTUSERTYPE;
*/
Declare
Tutt Testusertypetable;
TOTALCOUNT NUMBER ;
Begin
Select Testusertype(Object_Id,Object_Name)
bulk collect into TUTT
From User_Objects
;
Dbms_Output.Put_Line(Tutt.Count);
Select Count(*) Into Totalcount
From User_Objects Uu
Inner Join Table(Tutt) T
ON T.User_Id = Uu.Object_Id;
Dbms_Output.Put_Line(Tutt.Count);
Dbms_Output.Put_Line(Totalcount);
End ;