I have to process a large table (2.5B records) row by row in order to keep track of two variables. As one can imagine, this is quite slow. I am looking for ideas on how to tune this procedure. Thank you.
declare
cursor c_data is select /*+ index(data data_pk) */ * from data order by data_id;
r_data c_data%ROWTYPE;
lst_b_prc number(15,8);
lst_a_prc number(15,8);
begin
open c_data;
loop
fetch c_data into r_data;
exit when c_data%NOTFOUND;
if r_data.BATS = 'B' then
lst_b_prc := r_data.PRC;
end if;
if r_data.BATS = 'A' then
lst_a_prc := r_data.PRC;
end if;
if r_data.BATS = 'T' then
insert into trans .... lst_a_prc , lst_b_prc
end if;
end loop;
close c_data;
end;
The issue really comes down to finding efficient sql to track the latest PRC value when BATS='A' and BATS='B' for each BATS='T' record.
If I understand your problem correctly, with a table of data like this:
create table data as
select 1 data_id, 'T' bats, 1 prc from dual union all
select 2 data_id, 'A' bats, 2 prc from dual union all
select 3 data_id, 'B' bats, 3 prc from dual union all
select 4 data_id, 'T' bats, 4 prc from dual union all
select 5 data_id, 'A' bats, 5 prc from dual union all
select 6 data_id, 'T' bats, 6 prc from dual union all
select 7 data_id, 'B' bats, 7 prc from dual union all
select 8 data_id, 'T' bats, 8 prc from dual union all
select 9 data_id, 'T' bats, 9 prc from dual;
You want to insert one row for each T, using the last PRC value for A and B, which would look something like this:
T data_id  Last A  Last B
---------  ------  ------
1          null    null
4          2       3
6          5       3
8          5       7
9          5       7
This query should work. LAST_VALUE with IGNORE NULLS carries the most recent A and B price forward to each row:
select data_id, last_A, last_B
from
(
select data_id, bats, prc
,last_value(case when bats = 'A' then prc end ignore nulls) over
(order by data_id
rows between unbounded preceding and current row) last_A
,last_value(case when bats = 'B' then prc end ignore nulls) over
(order by data_id
rows between unbounded preceding and current row) last_B
from data
)
where bats = 'T';
With so much data, you'll probably want to use direct path writes and parallelism.
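A minimal sketch of what a direct-path, parallel load could look like. The column list of TRANS (data_id, last_a, last_b) and the degree of parallelism (8) are assumptions, since the question does not show them; adjust both to your schema and hardware. LAST_VALUE ... IGNORE NULLS is used so that the most recent A and B price is carried forward.

```sql
-- Parallel DML has to be enabled per session before the INSERT can run in parallel.
alter session enable parallel dml;

-- APPEND requests a direct-path insert; PARALLEL(..., 8) is an assumed degree.
-- The TRANS column names are hypothetical.
insert /*+ append parallel(trans, 8) */ into trans (data_id, last_a, last_b)
select data_id, last_a, last_b
from
(
  select /*+ parallel(data, 8) */
         data_id, bats,
         last_value(case when bats = 'A' then prc end ignore nulls) over
           (order by data_id
            rows between unbounded preceding and current row) last_a,
         last_value(case when bats = 'B' then prc end ignore nulls) over
           (order by data_id
            rows between unbounded preceding and current row) last_b
  from data
)
where bats = 'T';

-- After a direct-path insert the session must commit (or roll back)
-- before it can query TRANS again.
commit;
```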
The performance will largely depend on whether the sorting for the analytic functions can be done in memory or on disk. Optimizing memory can be very difficult; you'll probably need to work with a DBA to allow your process to use as much memory as possible without causing problems for other processes.
There are several options. Most importantly, you're probably keeping a huge UNDO/REDO log for all your inserts. You could occasionally commit your work, say every 1000 inserts.
Another option is to use a single SQL MERGE statement (or a simpler INSERT .. SELECT .. statement), which lets your Oracle instance operate on sets rather than on single records. Oracle can then optimise the execution plan of the SELECT together with the INSERT.
What is the sql code to print 'Query' if the data in field = 'Q'?
Welcome to SO!
An example out of the box:
select decode(dummy, 'X', 'Y') from dual;
For your scenario, something like:
select decode(mycol, 'Q', 'Query') mycol from mytable;
Best of luck!
The preferred option is to use CASE because of its readability, although, as @Bjarte suggested, DECODE can also be used (which is what I do, especially for simple cases). Also, tables have columns, not fields.
Anyway, CASE:
SQL> with test (field) as
2 -- sample data; you already have that and don't type it
3 (select 'A' from dual union all
4 select 'Q' from dual union all
5 select 'B' from dual
6 )
7 -- query you need
8 select field,
9 case when field = 'A' then 'Answer'
10 when field = 'Q' then 'Query'
11 else 'Unknown'
12 end as result
13 from test;
F RESULT
- -------
A Answer
Q Query
B Unknown
SQL>
I am populating a dimension table named TIMES with data from an OLTP Table called SALES with the following code:
CREATE TABLE TIMES
(saleDay DATE PRIMARY KEY,
dayType VARCHAR(50) NOT NULL);
BEGIN
FOR rec IN
(SELECT saleDate, CASE WHEN h.hd IS NOT NULL THEN 'Holiday'
WHEN to_char(saleDate, 'd') IN (1,7) THEN 'Weekend'
ELSE 'Weekday' END dayType
FROM SALES s LEFT JOIN
(SELECT '01.01' hd FROM DUAL UNION ALL
SELECT '15.01' FROM DUAL UNION ALL
SELECT '19.01' FROM DUAL UNION ALL
SELECT '28.05' FROM DUAL UNION ALL
SELECT '04.07' FROM DUAL UNION ALL
SELECT '08.10' FROM DUAL UNION ALL
SELECT '11.11' FROM DUAL UNION ALL
SELECT '22.11' FROM DUAL UNION ALL
SELECT '25.12' FROM DUAL) h
ON h.hd = TO_CHAR(s.saleDate, 'dd.mm'))
LOOP
INSERT INTO TIMES VALUES rec;
END LOOP;
END;
/
When I run this, I'm getting the errors ORA-00001 (Unique Constraint Violation) and ORA-06512. I believe this is happening because the code is trying to input multiple dates (some of which are the same) into PK for my TIMES Dimension Table (saleDay). How would I implement a clause into this loop so it will only populate one instance of each saleDate into the saleDay PK so there isn't a violation?
For instance, if there are three rows in the SALES table where the saleDate is 2015-10-10, the code should only populate ONE instance of 2015-10-10 into the saleDay PK. I'm thinking the direction I should head in is to implement a WHILE clause; however, I'm not 100% sure how that would work, since this code also uses CASE to determine whether the saleDay was a Holiday, Weekday, or Weekend and populates the result into the dayType column.
Adding DISTINCT as suggested in a Comment below your question is one way to solve the problem.
The following approach may be more efficient:
for rec in (select distinct saledate from sales)
loop
insert into times (saleday, daytype) values
(rec.saledate, CASE .......);
end loop;
That is: put the CASE expression in the INSERT statement, not in the definition of the (implicit) cursor. There is no reason to compute the CASE expression multiple times for the same date, which may appear many times in the SALES table. There is no reason for the CASE expression to be part of the cursor, either. The CASE expression can use an IN condition (case when to_char(rec.saledate, 'dd.mm') in ('01.01', '15.01', ....) then 'Holiday' when .......)
Unless, of course, the homework problem specifically instructs you to use a left outer join....... :-(
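Filled in, the loop described above might look like the sketch below. The holiday list and the weekend test are copied from the question; note that to_char(..., 'd') depends on the session's NLS_TERRITORY setting, so treating 1 and 7 as the weekend is an assumption carried over from the original code.

```sql
begin
  -- One iteration per distinct sale date, so each date is inserted only once.
  for rec in (select distinct saledate from sales)
  loop
    insert into times (saleday, daytype)
    values (rec.saledate,
            case
              -- Holidays, matched on day and month only.
              when to_char(rec.saledate, 'dd.mm') in
                   ('01.01', '15.01', '19.01', '28.05', '04.07',
                    '08.10', '11.11', '22.11', '25.12') then 'Holiday'
              -- Day-of-week numbering depends on NLS_TERRITORY (assumed 1 = Sunday).
              when to_char(rec.saledate, 'd') in ('1', '7') then 'Weekend'
              else 'Weekday'
            end);
  end loop;
end;
/
```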
Adding DISTINCT resolved this. Originally thought DISTINCT would negatively impact the CASE but it doesn't. Thanks to I3rutt for pointing this out.
BEGIN
FOR rec IN
(SELECT DISTINCT saleDate, CASE WHEN h.hd IS NOT NULL THEN 'Holiday'
WHEN to_char(saleDate, 'd') IN (1,7) THEN 'Weekend'
ELSE 'Weekday' END dayType
FROM SALES s LEFT JOIN
(SELECT '01.01' hd FROM DUAL UNION ALL
SELECT '15.01' FROM DUAL UNION ALL
SELECT '19.01' FROM DUAL UNION ALL
SELECT '28.05' FROM DUAL UNION ALL
SELECT '04.07' FROM DUAL UNION ALL
SELECT '08.10' FROM DUAL UNION ALL
SELECT '11.11' FROM DUAL UNION ALL
SELECT '22.11' FROM DUAL UNION ALL
SELECT '25.12' FROM DUAL) h
ON h.hd = TO_CHAR(s.saleDate, 'dd.mm'))
LOOP
INSERT INTO TIMES VALUES rec;
END LOOP;
END;
/
I am trying to reverse a string without using REVERSE function. I came across one example which is something like:
select listagg(letter) within group(order by lvl)
from
(SELECT LEVEL lvl, SUBSTR ('hello', LEVEL*-1, 1) letter
FROM dual
CONNECT BY LEVEL <= length('hello'));
Apart from this approach, is there any other, better approach to do this?
If you're trying to avoid the undocumented reverse() function you could use the utl_raw.reverse() function instead, with appropriate conversion to and from RAW:
select utl_i18n.raw_to_char(
utl_raw.reverse(
utl_i18n.string_to_raw('Some string', 'AL32UTF8')), 'AL32UTF8')
from dual;
UTL_I18N.RAW_TO_CHAR(UTL_RAW.REVERSE(UTL_I18N.STRING_TO_RAW('SOMESTRING','AL32UT
--------------------------------------------------------------------------------
gnirts emoS
So that is taking an original value; doing utl_i18n.string_to_raw() on that; then passing that to utl_raw.reverse(); then passing the result of that back through utl_i18n.raw_to_char().
Not entirely sure how that will cope with multibyte characters, or what you'd want to happen to those anyway...
Or a variation from the discussion @RahulTripathi linked to, without the character set handling:
select utl_raw.cast_to_varchar2(utl_raw.reverse(utl_raw.cast_to_raw('Some string')))
from dual;
UTL_RAW.CAST_TO_VARCHAR2(UTL_RAW.REVERSE(UTL_RAW.CAST_TO_RAW('SOMESTRING')))
--------------------------------------------------------------------------------
gnirts emoS
But that thread also notes it only works for single-byte characters.
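If multibyte data is a concern, one alternative is a small PL/SQL function that walks the string with SUBSTR, which counts characters rather than bytes. This is only a sketch: the name reverse_chars is made up for illustration, and combining characters would still be separated from their base character.

```sql
create or replace function reverse_chars (p_str in varchar2) return varchar2
is
  l_result varchar2(32767);
begin
  -- Walk from the last character to the first. SUBSTR is character-based,
  -- so a multibyte character is copied as a whole.
  for i in reverse 1 .. nvl(length(p_str), 0) loop
    l_result := l_result || substr(p_str, i, 1);
  end loop;
  return l_result;
end reverse_chars;
/

select reverse_chars('Some string') as reversed from dual;
```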
You could do it like this:
with strings as (select 'hello' str from dual union all
select 'fred' str from dual union all
select 'this is a sentance.' from dual)
select str,
replace(sys_connect_by_path(substr (str, level*-1, 1), '~|'), '~|') rev_str
from strings
where connect_by_isleaf = 1
connect by prior str = str --added because of running against several strings at once
and prior sys_guid() is not null --added because of running against several strings at once
and level <= length(str);
STR REV_STR
------------------- --------------------
fred derf
hello olleh
this is a sentance. .ecnatnes a si siht
N.B. I used a delimiter of ~| simply because that's something unlikely to be part of your string. You need to supply a non-null delimiter to the sys_connect_by_path, hence why I didn't just leave it blank!
SELECT LISTAGG(STR) WITHIN GROUP (ORDER BY RN DESC)
FROM
(
SELECT ROWNUM RN, SUBSTR('ORACLE',ROWNUM,1) STR FROM DUAL
CONNECT BY LEVEL <= LENGTH('ORACLE')
);
You can try using this query:
SQL> ed
Wrote file afiedt.buf
1 with t as (select 'Reverse' as txt from dual)
2 select replace(sys_connect_by_path(ch,'|'),'|') as reversed_string
3 from (
4 select length(txt)-rownum as rn, substr(txt,rownum,1) ch
5 from t
6 connect by rownum <= length(txt)
7 )
8 where connect_by_isleaf = 1
9 connect by rn = prior rn + 1
10* start with rn = 0
SQL> /
Source
select listagg(rev)within group(order by rownum)
from
(select substr('Oracle',level*-1,1)rev from dual
connect by level<=length('Oracle'));
Data in the file_name field of the generation table should be an assigned number, then _01, _02, or _03, etc. and then .pdf (example 82617_01.pdf).
Somewhere, the program is putting a state name and sometimes a date/time stamp, between the assigned number and the 01, 02, etc. (82617_ALABAMA_01.pdf or 19998_MAINE_07-31-2010_11-05-59_AM.pdf or 5485325_OREGON_01.pdf for example).
We would like to develop a SQL statement to find the bad file names and fix them. In theory it seems rather simple to find file names that include a varchar2 data type and remove it, but putting the statement together is beyond me.
Any help or suggestions appreciated.
Something like:
UPDATE GENERATION
SET FILE_NAME (?)
WHERE FILE_NAME (?...LIKE '%STRING%');?
You can find the problem rows like this:
select *
from Files
where length(FILE_NAME) - length(replace(FILE_NAME, '_', '')) > 1
You can fix them like this:
update Files
set FILE_NAME = SUBSTR(FILE_NAME, 1, instr(FILE_NAME, '_') -1) ||
SUBSTR(FILE_NAME, instr(FILE_NAME, '_', 1, 2))
where length(FILE_NAME) - length(replace(FILE_NAME, '_', '')) > 1
SQL Fiddle Example
You can also use Regexp_replace function:
SQL> with t1(col) as(
2 select '82617_mm_01.pdf' from dual union all
3 select '456546_khkjh_89kjh_67_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col
8 , regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3') res
9 from t1;
COL RES
-------------------------------------- -----------------------------------------
82617_mm_01.pdf 82617_01.pdf
456546_khkjh_89kjh_67_01.pdf 456546_01.pdf
19998_MAINE_07-31-2010_11-05-59_AM.pdf 19998_MAINE_07-31-2010_11-05-59_AM.pdf
5485325_OREGON_01.pdf 5485325_01.pdf
To display good or bad data, the regexp_like function will come in handy:
SQL> with t1(col) as(
2 select '826170_01.pdf' from dual union all
3 select '456546_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col bad_data
8 from t1
9 where not regexp_like(col, '^[0-9]+_\d{2}\.pdf$');
BAD_DATA
--------------------------------------
19998_MAINE_07-31-2010_11-05-59_AM.pdf
5485325_OREGON_01.pdf
SQL> with t1(col) as(
2 select '826170_01.pdf' from dual union all
3 select '456546_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col good_data
8 from t1
9 where regexp_like(col, '^[0-9]+_\d{2}\.pdf$');
GOOD_DATA
--------------------------------------
826170_01.pdf
456546_01.pdf
To that end your update statement might look like this:
update your_table
set col = regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3');
--where clause if needed
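One possible WHERE clause, as a sketch: restrict the update to rows that have something between the leading number and the two-digit suffix, so that file names that are already correct are not touched. The table and column names are the same placeholders used above.

```sql
update your_table
   set col = regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3')
 -- Only rows of the form <number>_<anything>_<two digits>.pdf are updated;
 -- names such as 82617_01.pdf contain a single underscore and are skipped.
 where regexp_like(col, '^[0-9]+_.*_\d{2}\.pdf$');
```

Names that do not end in a two-digit suffix, such as 19998_MAINE_07-31-2010_11-05-59_AM.pdf, are left unchanged by both the pattern and the filter and would need separate handling.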
I wanted to see what some suggested approaches would be to validate a field that is stored as a CSV against a table containing the appropriate values. Although it would be desirable, it is NOT an option to split the CSV list into another related table. In the example data below I would be trying to capture the code 99 for widget D.
Below is an example data representation.
Table: Widgets
WidgetName  WidgetCodeList
A           1, 2, 3
B           1
C           2, 3
D           99
Table: WidgetCodes
WidgetCode
1
2
3
An earlier approach was to query the CSV column as rows using various string manipulations and CONNECT BY LEVEL; however, the performance was not acceptable.
You could try a pipelined function (here with a lateral join):
SQL> WITH widgets AS (
2 SELECT 'A' WidgetName, '1, 2, 3' WidgetCodeList FROM dual
3 UNION ALL SELECT 'B', '1' FROM DUAL
4 UNION ALL SELECT 'C', '2, 3' FROM DUAL
5 UNION ALL SELECT 'D', '99' FROM DUAL
6 ), widgetcodes AS (
7 SELECT ROWNUM widgetcode from dual CONNECT BY LEVEL <= 3
8 )
9 SELECT w.widgetname,
10 to_number(s.column_value) missing_widget
11 FROM widgets w
12 CROSS JOIN TABLE(demo_pkg.string_to_tab(w.WidgetCodeList)) s
13 WHERE NOT EXISTS (SELECT NULL
14 FROM widgetcodes ws
15 WHERE ws.widgetcode = to_number(s.column_value));
WIDGETNAME MISSING_WIDGET
---------- --------------
D 99
See this other SO for an example of a pipelined function that converts a character string to a table.
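For reference, a minimal sketch of what demo_pkg.string_to_tab could look like. The query above only shows how the function is called, so the body, the p_delim parameter, and the use of the built-in collection type sys.odcivarchar2list are all assumptions.

```sql
create or replace package demo_pkg as
  -- Splits p_list on p_delim and returns one trimmed element per row.
  function string_to_tab (p_list  in varchar2,
                          p_delim in varchar2 default ',')
    return sys.odcivarchar2list pipelined;
end demo_pkg;
/

create or replace package body demo_pkg as
  function string_to_tab (p_list  in varchar2,
                          p_delim in varchar2 default ',')
    return sys.odcivarchar2list pipelined
  is
    l_rest varchar2(32767) := p_list;
    l_pos  pls_integer;
  begin
    while l_rest is not null loop
      l_pos := instr(l_rest, p_delim);
      if l_pos = 0 then
        -- Last (or only) element.
        pipe row (trim(l_rest));
        l_rest := null;
      else
        -- Emit the element before the delimiter, then continue after it.
        pipe row (trim(substr(l_rest, 1, l_pos - 1)));
        l_rest := substr(l_rest, l_pos + length(p_delim));
      end if;
    end loop;
    return;
  end string_to_tab;
end demo_pkg;
/
```

Each element is exposed to SQL as COLUMN_VALUE, which is the column name the query above relies on.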
In PL/SQL, you could make use of the Apex utility for converting a delimited string to a PL/SQL collection like this:
procedure validate_csv (p_csv varchar2)
is
v_array apex_application_global.vc_arr2;
v_dummy varchar2(1);
begin
v_array := apex_util.string_to_table(p_csv, ', ');
for i in 1..v_array.count
loop
begin
select null
into v_dummy
from widgetcodes
where widgetcode = v_array(i);
exception
when no_data_found then
raise_application_error(-20001, 'Invalid widget code: '||v_array(i));
end;
end loop;
end;