Oracle finding duplicate rows that are consecutive for a given field

Oracle finding duplicate rows that are consecutive for a given field - oracle

We have some tables in an Oracle 19c database that were put in place many years ago for audit purposes. This 'audit' table is being generated by a trigger on update/delete to another table "widget". It is not checking to see if the new data matches the old data, so there are a lot of rows that are identical in everything but timestamp. I am not looking for comments on fixing the auditing as there many solutions to that mess. We are currently just stopping the bleeding until a better solution can be implemented.
My question is how to cut my existing audit data down to only the records that actually changed. These tables have grown to the point that it is impacting some of our processes.
auditWidget table data:
dataRowID
widgetID
widgetDesc
timestampOfTableUpdate
1
1
A
1/29/2016 12:12:55 PM
2
2
test
1/29/2016 12:13:55 PM
3
1
A
1/29/2016 12:14:55 PM
4
1
A
1/29/2016 12:15:55 PM
5
2
test
1/29/2016 12:16:55 PM
6
1
c
1/29/2016 12:17:55 PM
7
1
b
1/29/2016 12:18:55 PM
8
2
d
1/29/2016 12:19:55 PM
9
1
A
1/29/2016 12:20:55 PM
As you can see, nothing has changed between row 1,3 and 4, or between 2 and 5. These rows should never have been created. I am trying to find a way to identify/delete the entries that don't show any change in the widget table. My final data would look like this:
dataRowID
widgetID
widgetDesc
timestampOfTableUpdate
1
1
A
1/29/2016 12:12:55 PM
2
2
b
1/29/2016 12:13:55 PM
6
1
c
1/29/2016 12:17:55 PM
7
1
b
1/29/2016 12:18:55 PM
8
2
d
1/29/2016 12:19:55 PM
9
1
A
1/29/2016 12:20:55 PM
LAG only looks at the previous row which might contain a different widgetID. There are thousands of widgetIDs. There is also potentially many identical rows per widgetID in a row. Are there any solutions out there short of looping through the data and conditionally doing something with it row by row? Any direction would be appreciated.
SQL to create data for testing:
CREATE TABLE AUDITWIDGET
(
"DATAROWID" NUMBER(,0),
"WIDGETID" NUMBER(,0),
"WIDGETDESC" VARCHAR2(30),
"TIMESTAMPOFTABLEUPDATE" DATE);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (1,1,'A',sysdate - interval '100' minute);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (2,2,'test',sysdate - interval '99' minute);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (3,1,'A',sysdate - interval '98' minute);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (4,1,'A',sysdate - interval '97' minute);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (5,2,'test',sysdate - interval '96' minute);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (6,1,'c',sysdate - interval '95' minute);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (7,1,'b',sysdate - interval '94' minute);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (8,1,'d',sysdate - interval '93' minute);
Insert into AUDITWIDGET (datarowID,widgetID,widgetDesc,timeStampOfTableUpdate) values (9,1,'A',sysdate - interval '92' minute);

The simple solution to only SELECT the relevant data is to use LAG(). For example:
select *,
from (
select t.*,
lag(widgetdesc) over(
partition by widgetid
order by timestampOfTableUpdate
) as prev_desc
from my_table t
) x
where widgetdesc <> prev_desc

Delete duplicates, then:
delete from auditwidget a
where a.rowid > (select min(b.rowid)
from auditwidget b
where b.widgetid = a.widgetid
and b.widgetdesc = a.widgetdesc
);

Related

Where clause from a subquery

I have a table with business days BUSINESS_DAYS which has all the dates
I have another table with payment information and DUE_DATES
I want to return in my query the next business day IF the DUE_DATE is not a business day
SELECT SQ1.DUE_DATE, SQ2.DATE FROM
(SELECT * FROM
PAYMENTS
ORDER BY
DUE_DATE) SQ1,
(SELECT MIN(DATE) DATE FROM BUSINESS_DAYS WHERE SQ1.DUE_DATE <= DATE GROUP BY DATE) SQ2
Anyone can shed some light?

The way I see it, code you posted doesn't do what you wanted anyway (otherwise, you won't be asking a question at all). Therefore, I'd suggest another approach:
Altering the session (you don't have to do it; my database speaks Croatian so I'm switching to English; also, setting date format to display day name):
SQL> alter session set nls_date_language = 'english';
Session altered.
SQL> alter session set nls_date_format = 'dd.mm.yyyy, dy';
Session altered.
Two CTEs contain
business_days: as commented, only this year's July, weekends excluded, there are no holidays)
payments: two rows, one whose due date is a working day and another whose isn't
Sample data end at line #15, query you might be interested in begins at line #16. Its CASE expression check whether due_date is one of weekend days; if not, due date to be returned is exactly what it is. Otherwise, another SELECT statement returns the first (MIN) business day larger than due_date.
SQL> with
2 business_days (datum) as
3 -- for simplicity, only all dates in this year's July,
4 -- weekends excluded (as they aren't business days), no holidays
5 (select date '2021-07-01' + level - 1
6 from dual
7 where to_char(date '2021-07-01' + level - 1, 'dy')
8 not in ('sat', 'sun')
9 connect by level <= 31
10 ),
11 payments (id, due_date) as
12 (select 1, date '2021-07-14' from dual -- Wednesday, business day
13 union all
14 select 2, date '2021-07-25' from dual -- Sunday, non-business day
15 )
16 select p.id,
17 p.due_date current_due_date,
18 --
19 case when to_char(p.due_date, 'dy') not in ('sat', 'sun') then
20 p.due_date
21 else (select min(b.datum)
22 from business_days b
23 where b.datum > p.due_date
24 )
25 end new_due_date
26 from payments p
27 order by id;
ID CURRENT_DUE_DAT NEW_DUE_DATE
---------- --------------- ---------------
1 14.07.2021, wed 14.07.2021, wed --> Wednesday remains "as is"
2 25.07.2021, sun 26.07.2021, mon --> Sunday switched to Monday
SQL>

Training schedule till end of year

I have a schedule of my training, three times a week, for example -
MON, WED,FRI. I need to generate records for my schedule table with dates till the end of the current year when I have training.
The schedule table is:
CREATE TABLE trainingSchedule (
id NUMBER,
training_date DATE
);
If training date already exist - don’t insert a record.

Here's one option. Read comments within code.
SQL> CREATE TABLE trainingSchedule
2 (id NUMBER,
3 training_date DATE
4 );
Table created.
SQL> create sequence seq_tra;
Sequence created.
SQL> -- initial insert (just to show that MERGE will skip it
SQL> insert into trainingschedule values (seq_tra.nextval, date '2021-03-22');
1 row created.
MERGE will skip rows that are already inserted. I understood that you want to insert only dates that follow today's date; if that's not so, just remove the last condition.
SQL> merge into trainingschedule t
2 using (-- this is a calendar for current year
3 select trunc(sysdate, 'yyyy') + level - 1 datum
4 from dual
5 connect by level <= add_months(trunc(sysdate, 'yyyy'), 12) - trunc(sysdate, 'yyyy')
6 ) c
7 on (c.datum = t.training_date)
8 when not matched then insert (id, training_date) values (seq_tra.nextval, c.datum)
9 -- insert only Mondays, Wednesdays and Fridays
10 where to_char(c.datum, 'dy', 'nls_date_language = english') in ('mon', 'wed', 'fri')
11 -- insert only dates that follow today's date ("till the end of the current year")
12 and datum >= trunc(sysdate);
122 rows merged.
SQL>
What's in there?
SQL> select id,
2 to_char(training_date, 'dd.mm.yyyy, dy', 'nls_date_language = english') tr_date
3 from trainingschedule
4 order by training_date;
ID TR_DATE
---------- ------------------------
1 22.03.2021, mon --> see? No duplicates
311 24.03.2021, wed
309 26.03.2021, fri
207 29.03.2021, mon
354 31.03.2021, wed
321 02.04.2021, fri
<skip>

Can't figure out cursor for loop in a plsql procedure

I am trying to write a procedure in plsql that takes in two parameters, month and year. The procedure generates data for a table - loanreport. The procedure generates for all loan types in the loan type table. When the procedure is run it should populate the table with:
Month
Year
closed loan amount (sum of loan amounts of loans with status = 6) if no loans have status = 6 then closed loan amount is 0.
4.Average Closing Period
Here are the pertinent tables:
CREATE TABLE LOANDETAILS
(LOANNO VARCHAR2(11) primary key,
PROPERTYID VARCHAR2(10),
CUSTID CHAR(8),
LOANTYPE VARCHAR2(20),
LOANSTATUSCODE NUMBER(3,0),
LOANAMOUNT NUMBER(10,2),
RATE NUMBER(5,2),
LOANCREATIONDATE DATE,
LOANSTATUSDATE DATE,
constraint loandet_prop_fk foreign key(PROPERTYID) references PROPERTIES(propertyid),
constraint loandet_cust_fk foreign key(CUSTID) references customers(custid),
constraint loandet_lt_fk foreign key (LOANTYPE) references loantypes(loantype)
);
--insert
Insert into LOANDETAILS values ('L1000000001','P1000001','C1000001','Conventional',1,87975,9,to_date('26-JUL-2016','DD-MON-YY'),to_date('02-AUG-2016','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000009','P1000009','C1000009','FHA',6,160055,4.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('07-DEC-2016','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000010','P1000010','C1000010','VA',2,217600,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('07-DEC-2016','DD-MON-YYYY'));
CREATE TABLE LOANTYPES
(ltID char(5) constraint loantypes_pk primary key,
loantype VARCHAR2(20) constraint loantypes_lt_unique UNIQUE,
description VARCHAR2(100),
active char(1) constraint loantypes_active CHECK (active IN ('Y','N')) -- if loan type is currently being offered
);
Insert into loantypes values ('LT001', 'VA', 'Service members, veterans or eligible family','Y');
Insert into loantypes values ('LT002', 'FHA', 'Federal Housing Administration eligible loans', 'Y');
Insert into loantypes values ('LT003', 'Conventional', 'Standard loan','Y');
Insert into loantypes values ('LT004', 'Employee', 'Eligible employees of the organization','Y');
Insert into loantypes values ('LT005', 'Reconstruct', 'Relief work reconstruction','N');
CREATE TABLE LOANTYPEREPORT
(LOANTYPE VARCHAR2(20),
MONTH number(2,0),
YEAR NUMBER(4,0),
CLOSEDLOANSAMOUNT NUMBER(15,2),
AVERAGECLOSINGPERIOD NUMBER(5,2),
constraint loantr_pk PRIMARY KEY (LOANTYPE, RMONTH, RYEAR)
);
I am new to sql and I clearly have some knowledge gaps. I am attempting to create a procedure then would like to make a cursor and iterate over the cursor with a for loop to create the desired report. Here is my incomplete code:
CREATE OR REPLACE PROCEDURE loan_type_report_procedure (Month loantypereport.month%type, Year loantypereport.year%type) AS
CURSOR C1 IS
SELECT l.loantype,
loanamount,
loancreationdate,
loanstatusdate,
loanstatuscode,
to_char(LOANCREATIONDATE, 'mm') AS rMonth,
to_char(LOANCREATIONDATE, 'YYYY') AS rYEAR
FROM LOANTYPES l
JOIN LOANDETAILS d
ON l.loantype = d.loantype
WHERE Month = to_char(LOANCREATIONDATE, 'mm')
AND Year = to_char(LOANCREATIONDATE, 'YYYY')
BEGIN
FOR loan_rec in C1 LOOP
As far as my understanding goes for loop goes row by row in the cursor. If I want a final table loan type report that contains loantype, month, year, closed loan amount, and average closing period - how do I make that work? Closed loan amount and average closing period are both aggregated on the loan type grouping. Would I use a group by and having in the cursor select statement?
Thank you for your insight

If I understand your question correctly, you'd like to summarize the number of loans by their LOANTYPE, for each month, and for the loans that were closed (LOANSTATUSCODE = 6), you'd like to SUM their amount and record an AVERAGE of their loan time span.
From the look of LOANTYPEREPORT, it looks like you are planning for this average span to be in a number of days.
To accomplish this, you do not need to use PL/SQL. This can be done with traditional SQL. I'm only guessing at what criteria go into deciding a loan's duration, so I'll outline an example below with a couple variations.
In this example, I needed to modify your tables a little, since they reference other tables not included in your post (I dropped LOANDET_PROP_FK and LOANDET_CUST_FK).
After creating your example tables, create the LOANTYPEREPORT table (modified slightly from your example to compile):
CREATE TABLE LOANTYPEREPORT
(LOANTYPE VARCHAR2(20),
MONTH NUMBER(2,0),
YEAR NUMBER(4,0),
CLOSEDLOANSAMOUNT NUMBER(15,2),
AVERAGECLOSINGPERIOD NUMBER(5,2),
CONSTRAINT LOANTR_PK PRIMARY KEY (LOANTYPE, MONTH, YEAR)
);
Also, I added a little extra data for some FHA and Conventional loans, to help differentiate the SUMs and durations in the examples below.
Insert into LOANDETAILS values ('L1000000001','P1000001','C1000001','Conventional',1,87975,9,to_date('26-JUL-2016','DD-MON-YY'),to_date('02-AUG-2016','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000009','P1000009','C1000009','FHA',6,160055,4.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('07-DEC-2016','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000010','P1000010','C1000010','VA',2,217600,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('07-DEC-2016','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000011','P1000010','C1000010','VA',6,217600,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('07-DEC-2016','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000012','P1000010','C1000010','VA',6,111111,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('17-DEC-2016','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000013','P1000010','C1000010','VA',2,222222,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('27-DEC-2016','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000014','P1000010','C1000010','Conventional',6,333333,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('27-JAN-2017','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000015','P1000010','C1000010','Conventional',5,333333,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('27-FEB-2017','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000016','P1000010','C1000010','Conventional',4,333333,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('27-MAR-2017','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000017','P1000010','C1000010','FHA',4,444444,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('27-APR-2017','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000018','P1000010','C1000010','FHA',6,200000,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('27-APR-2017','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000019','P1000010','C1000010','FHA',6,300000,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('27-MAY-2017','DD-MON-YYYY'));
Insert into LOANDETAILS values ('L1000000020','P1000010','C1000010','FHA',6,300000,7.5,to_date('30-NOV-2016','DD-MON-YYYY'),to_date('22-MAY-2017','DD-MON-YYYY'));
Then, you can load the report table.
Example 1 This assumes that only closed loans should be included in the loan-duration average, that the loan-duration is between LOANCREATIONDATE and LOANSTATUSDATE, and that you only want data for months and loan-types where loans were actually closed. This means july 2016 will not be included at all, since no loans were closed in that month.
INSERT INTO LOANTYPEREPORT
SELECT
LOANDETAILS.LOANTYPE,
EXTRACT(MONTH FROM LOANDETAILS.LOANSTATUSDATE) AS MONTH,
EXTRACT(YEAR FROM LOANDETAILS.LOANSTATUSDATE) AS YEAR,
SUM(LOANDETAILS.LOANAMOUNT) AS CLOSED_LOAN_AMOUNT,
AVG(LOANDETAILS.LOANSTATUSDATE - LOANDETAILS.LOANCREATIONDATE) AS AVERAGE_LOAN_DURATION
FROM LOANDETAILS
WHERE LOANDETAILS.LOANSTATUSCODE = 6
GROUP BY
LOANDETAILS.LOANTYPE,
EXTRACT(YEAR FROM LOANDETAILS.LOANSTATUSDATE),
EXTRACT(MONTH FROM LOANDETAILS.LOANSTATUSDATE);
Then see what it did:
SELECT YEAR,MONTH,LOANTYPE,CLOSEDLOANSAMOUNT,AVERAGECLOSINGPERIOD
FROM LOANTYPEREPORT
ORDER BY YEAR, MONTH, LOANTYPE;
YEAR MONTH LOANTYPE CLOSEDLOANSAMOUNT AVERAGECLOSINGPERIOD
2016 12 FHA 160055 7
2016 12 VA 328711 12
2017 1 Conventional 333333 58
2017 4 FHA 200000 148
2017 5 FHA 600000 175.5
Example 2: But if you want to include data for months where no loans were closed for a given loan-type (I'm not sure but this might be the third item in your post), then you'll need to enumerate the months.
You can do this several ways, but I'll just include an extra query in this example that sets boundaries for the for 2016 - 2018.
Run the insert:
INSERT INTO LOANTYPEREPORT
WITH YEAR_MONTH AS(
SELECT THE_MONTH.MONTH_NUMBER,
THE_YEAR.YEAR_NUMBER
FROM
(SELECT LEVEL AS MONTH_NUMBER FROM DUAL CONNECT BY LEVEL < 13) THE_MONTH
CROSS JOIN
(SELECT YEAR_NUMBER FROM
(SELECT LEVEL AS YEAR_NUMBER FROM DUAL CONNECT BY LEVEL < 2019)
WHERE YEAR_NUMBER BETWEEN 2016 AND 2018) THE_YEAR
)
SELECT
LOANTYPES.LOANTYPE,
YEAR_MONTH.MONTH_NUMBER,
YEAR_MONTH.YEAR_NUMBER,
SUM(COALESCE(CLOSED_LOAN.LOANAMOUNT,0)) AS CLOSED_LOAN_AMOUNT,
AVG(CLOSED_LOAN.LOAN_DURATION) AS AVERAGE_LOAN_DURATION
FROM
YEAR_MONTH
CROSS JOIN LOANTYPES
LEFT OUTER JOIN (SELECT EXTRACT(MONTH FROM LOANDETAILS.LOANSTATUSDATE) AS MONTH,
EXTRACT(YEAR FROM LOANDETAILS.LOANSTATUSDATE) AS YEAR,
LOANDETAILS.LOANTYPE,
LOANDETAILS.LOANAMOUNT,
LOANDETAILS.LOANSTATUSDATE - LOANDETAILS.LOANCREATIONDATE AS LOAN_DURATION
FROM LOANDETAILS
WHERE LOANDETAILS.LOANSTATUSCODE = 6) CLOSED_LOAN
ON YEAR_MONTH.YEAR_NUMBER = CLOSED_LOAN.YEAR
AND YEAR_MONTH.MONTH_NUMBER = CLOSED_LOAN.MONTH
AND LOANTYPES.LOANTYPE = CLOSED_LOAN.LOANTYPE
GROUP BY YEAR_NUMBER, MONTH_NUMBER, LOANTYPES.LOANTYPE
ORDER BY YEAR_NUMBER, MONTH_NUMBER, LOANTYPES.LOANTYPE;
And see what data is generated:
YEAR MONTH LOANTYPE CLOSEDLOANSAMOUNT AVERAGECLOSINGPERIOD
2016 1 Conventional 0
2016 1 Employee 0
2016 1 FHA 0
2016 1 Reconstruct 0
2016 1 VA 0
2016 2 Conventional 0
...
...
...
2016 12 Conventional 0
2016 12 Employee 0
2016 12 FHA 160055 7
2016 12 Reconstruct 0
2016 12 VA 328711 12
2017 1 Conventional 333333 58
2017 1 Employee 0
2017 1 FHA 0
2017 4 FHA 200000 148
2017 4 Reconstruct 0
2017 4 VA 0
2017 5 Conventional 0
2017 5 Employee 0
2017 5 FHA 600000 175.5
2017 5 Reconstruct 0
2017 5 VA 0
2017 6 Conventional 0
2017 6 Employee 0
2017 6 FHA 0
This way you get sum of closing amounts for loans closed each month of each type, or zero if none were closed.
EDIT with an example of returning from a function.
Just to reiterate, you do not need to use a function to do this kind of reporting.
But if you have a requirement to use a function, here is an example:
First, create your return type:
CREATE TYPE LOAN_TYPE_MONTH_REPORT IS OBJECT (
LOANTYPE VARCHAR2(20),
MONTH number(2,0),
YEAR NUMBER(4,0),
CLOSEDLOANSAMOUNT NUMBER(15,2),
AVERAGECLOSINGPERIOD NUMBER(5,2)
);
/
CREATE TYPE LOAN_TYPE_MONTH_REPORT_LIST IS TABLE OF LOAN_TYPE_MONTH_REPORT;
/
Then create your function:
CREATE FUNCTION GET_LOAN_TYPE_REPORT_FOR_MONTH(P_YEAR IN NUMBER, P_MONTH IN NUMBER)
RETURN LOAN_TYPE_MONTH_REPORT_LIST
IS
V_MONTH_REPORT LOAN_TYPE_MONTH_REPORT_LIST;
BEGIN
SELECT
LOAN_TYPE_MONTH_REPORT(
THE_MONTH_YEAR.LOANTYPE,
THE_MONTH_YEAR.THE_MONTH,
THE_MONTH_YEAR.THE_YEAR,
COALESCE(CLOSED_LOAN_SUMMARY.CLOSED_LOAN_AMOUNT,0),
CLOSED_LOAN_SUMMARY.AVERAGE_LOAN_DURATION)
BULK COLLECT INTO V_MONTH_REPORT
FROM
(SELECT P_YEAR AS THE_YEAR, P_MONTH AS THE_MONTH, LOANTYPES.LOANTYPE FROM LOANTYPES) THE_MONTH_YEAR
LEFT OUTER JOIN
(SELECT
LOANDETAILS.LOANTYPE,
EXTRACT(MONTH FROM LOANDETAILS.LOANSTATUSDATE) AS THE_MONTH,
EXTRACT(YEAR FROM LOANDETAILS.LOANSTATUSDATE) AS THE_YEAR,
SUM(LOANDETAILS.LOANAMOUNT) AS CLOSED_LOAN_AMOUNT,
AVG(LOANDETAILS.LOANSTATUSDATE - LOANDETAILS.LOANCREATIONDATE) AS AVERAGE_LOAN_DURATION
FROM LOANDETAILS
WHERE LOANDETAILS.LOANSTATUSCODE = 6
GROUP BY
LOANDETAILS.LOANTYPE,
EXTRACT(YEAR FROM LOANDETAILS.LOANSTATUSDATE),
EXTRACT(MONTH FROM LOANDETAILS.LOANSTATUSDATE)) CLOSED_LOAN_SUMMARY
ON THE_MONTH_YEAR.THE_YEAR = CLOSED_LOAN_SUMMARY.THE_YEAR
AND THE_MONTH_YEAR.THE_MONTH = CLOSED_LOAN_SUMMARY.THE_MONTH
AND THE_MONTH_YEAR.LOANTYPE = CLOSED_LOAN_SUMMARY.LOANTYPE;
RETURN V_MONTH_REPORT;
END;
/
Then test it:
Here's a month with two loan-types with closings:
SELECT * FROM TABLE(GET_LOAN_TYPE_REPORT_FOR_MONTH(2016,12));
LOANTYPE MONTH YEAR CLOSEDLOANSAMOUNT AVERAGECLOSINGPERIOD
Employee 12 2016 0
VA 12 2016 328711 12
Reconstruct 12 2016 0
FHA 12 2016 160055 7
Conventional 12 2016 0
Or a month with just one loan-type with a closing:
SELECT * FROM TABLE(GET_LOAN_TYPE_REPORT_FOR_MONTH(2017,01));
LOANTYPE MONTH YEAR CLOSEDLOANSAMOUNT AVERAGECLOSINGPERIOD
Employee 1 2017 0
Reconstruct 1 2017 0
FHA 1 2017 0
VA 1 2017 0
Conventional 1 2017 333333 58

What is the Best way to Perform Bulk insert in oracle ?

With this, I mean Inserting millions of records in tables. I know how to insert data using loops but for inserting millions of data it won't be a good approach.
I have two tables
CREATE TABLE test1
(
col1 NUMBER,
valu VARCHAR2(30),
created_Date DATE,
CONSTRAINT pk_test1 PRIMARY KEY (col1)
)
/
CREATE TABLE test2
(
col2 NUMBER,
fk_col1 NUMBER,
valu VARCHAR2(30),
modified_Date DATE,
CONSTRAINT pk_test2 PRIMARY KEY (col2),
FOREIGN KEY (fk_col1) REFERENCES test1(col1)
)
/
Please suggest a way to insert some dummy records upto 1 million without loops.

As a fairly simplistic approach, which may be enough for you based on your comments, you can generate dummy data using a hierarchical query. Here I'm using bind variables to control how many are created, and to make some of the logic slightly clearer, but you could use literals instead.
First, parent rows:
var parent_rows number;
var avg_children_per_parent number;
exec :parent_rows := 5;
exec :avg_children_per_parent := 3;
-- create dummy parent rows
insert into test1 (col1, valu, created_date)
select level,
dbms_random.string('a', dbms_random.value(1, 30)),
trunc(sysdate) - dbms_random.value(1, 365)
from dual
connect by level <= :parent_rows;
which might generate rows like:
COL1 VALU CREATED_DA
---------- ------------------------------ ----------
1 rYzJBVI 2016-11-14
2 KmSWXfZJ 2017-01-20
3 dFSTvVsYrCqVm 2016-07-19
4 iaHNv 2016-11-08
5 AvAxDiWepPeONGNQYA 2017-01-20
Then child rows, which a random fk_col1 in the range generated for the parent:
-- create dummy child rows
insert into test2 (col2, fk_col1, valu, modified_date)
select level,
round(dbms_random.value(1, :parent_rows)),
dbms_random.string('a', dbms_random.value(1, 30)),
trunc(sysdate) - dbms_random.value(1, 365)
from dual
connect by level <= :parent_rows * :avg_children_per_parent;
which might generate:
select * from test2;
COL2 FK_COL1 VALU MODIFIED_D
---------- ---------- ------------------------------ ----------
1 2 AqRUtekaopFQdCWBSA 2016-06-30
2 4 QEczvejfTrwFw 2016-09-23
3 4 heWMjFshkPZNyNWVQG 2017-02-19
4 4 EYybXtlaFHkAYeknhCBTBMusGAkx 2016-03-18
5 4 ZNdJBQxKKARlnExluZWkHMgoKY 2016-06-21
6 3 meASktCpcuyi 2016-10-01
7 4 FKgmf 2016-09-13
8 3 JZhk 2016-06-01
9 2 VCcKdlLnchrjctJrMXNb 2016-05-01
10 5 ddL 2016-11-27
11 4 wbX 2016-04-20
12 1 bTfa 2016-06-11
13 4 QP 2016-08-25
14 3 RgmIahPL 2016-03-04
15 2 vhinLUmwLwZjczYdrPbQvJxU 2016-12-05
where the number of children varies for each parent:
select fk_col1, count(*) from test2 group by fk_col1 order by fk_col1;
FK_COL1 COUNT(*)
---------- ----------
1 1
2 3
3 3
4 7
5 1
To insert a million rows instead, just change the bind variables.
If you needed more of a relationship between the children and parents, e.g. so the modified date is always after the created date, you can modify the query; for example:
insert into test2 (col2, fk_col1, valu, modified_date)
select *
from (
select level,
round(dbms_random.value(1, :parent_rows)) as fk_col1,
dbms_random.string('a', dbms_random.value(1, 30)),
trunc(sysdate) - dbms_random.value(1, 365) as modified_date
from dual
connect by level <= :parent_rows * :avg_children_per_parent
) t2
where not exists (
select null from test1 t1
where t1.col1 = t2.fk_col1 and t1.created_date > t2.modified_date
);
You may also want non-midnight times (I set everything to midnight via the trunc() call, based on the column name being 'date' not 'datetime'), or some column values null; so this might just be a starting point for you.

Divide time period into partitions and calculate averages

I have to divide elapsed time of a process into slots of equal sizes and calculate number of rows inserted during each time slot.
Here is DDL of the table
ID number
insert_date_time date
Sample Data
101 Aug 1 2015 4:43:00 PM
931 Aug 1 2015 4:43:01 PM
The output I am looking for is as follows
Time Slot Rows Inserted
4:00 pm - 5:00 pm 103
5:00 pm - 6:00 pm 95
6:00 pm - 7:00 pm 643
( I have left out date portion for brevity )
Similarly, I have to find out how much time each 100 rows took
0 - 100 rows 4:00pm - 4:43pm
101 - 200 rows 4:43pm - 5:58pm
I know Oracle OLAP functions can be used for this, but not sure how ?

Hoping I understood your requirement.
First Req#:
SELECT TO_CHAR(INSERT_TIME,'HH24'),COUNT(*) FROM TABLE
GROUP BY TO_CHAR(INSERT_TIME,'HH24');
Second Req#:
SELECT ROUND((ROW_NUMBER-1)/100,0),
MIN(INSERT_TIME) +'-'+ MAX(INSERT_TIME)
FROM TABLE
GROUP BY ROUND((ROW_NUMBER-1)/100,0);

If you are looking for 1 hour partitions, you can simply truncate the date column you are looking at to the hour trunc(your_date_col, 'hour') using that truncated value in your grouping clause.
For the 2nd requirement you can partition your rows with a query like this:
with data as (
select a.your_date_col
, row_number() over (order by Your_Date_Col) rn
, row_number() over (order by Your_Date_Col) -
mod(row_number() over (order by Your_Date_Col)-1,50) grp
from your_table a
)
select min(rn) ||' - '
||max(rn) ||' rows' rows
, min(your_date_col)||' - '
||max(your_date_col) time_range
from data
group by grp

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Oracle finding duplicate rows that are consecutive for a given field - oracle

The simple solution to only SELECT the relevant data is to use LAG(). For example: select , from ( select t., lag(widgetdesc) over( partition by widgetid order by timestampOfTableUpdate ) as prev_desc from my_table t ) x where widgetdesc <> prev_desc

Delete duplicates, then: delete from auditwidget a where a.rowid > (select min(b.rowid) from auditwidget b where b.widgetid = a.widgetid and b.widgetdesc = a.widgetdesc );

Related

Where clause from a subquery

Training schedule till end of year

Can't figure out cursor for loop in a plsql procedure

What is the Best way to Perform Bulk insert in oracle ?

Divide time period into partitions and calculate averages

Categories

Resources

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Oracle finding duplicate rows that are consecutive for a given field - oracle

The simple solution to only SELECT the relevant data is to use LAG(). For example: select *, from ( select t.*, lag(widgetdesc) over( partition by widgetid order by timestampOfTableUpdate ) as prev_desc from my_table t ) x where widgetdesc <> prev_desc

Delete duplicates, then: delete from auditwidget a where a.rowid > (select min(b.rowid) from auditwidget b where b.widgetid = a.widgetid and b.widgetdesc = a.widgetdesc );

Related

Where clause from a subquery

Training schedule till end of year

Can't figure out cursor for loop in a plsql procedure

What is the Best way to Perform Bulk insert in oracle ?

Divide time period into partitions and calculate averages

Categories

Resources

The simple solution to only SELECT the relevant data is to use LAG(). For example: select , from ( select t., lag(widgetdesc) over( partition by widgetid order by timestampOfTableUpdate ) as prev_desc from my_table t ) x where widgetdesc <> prev_desc