Validating Column Data Stored as CSV Against Another Table - oracle

I wanted to see what some suggested approaches would be to validate a field that is stored as a CSV list against a table containing the appropriate values. Although it would be desirable, it is NOT an option to split the CSV list into another related table. In the example data below, I would be trying to capture the code 99 for widget D.
Below is an example data representation.
Table: Widgets

WidgetName  WidgetCodeList
A           1, 2, 3
B           1
C           2, 3
D           99

Table: WidgetCodes

WidgetCode
1
2
3
An earlier approach was to query the CSV column as rows using various string manipulations and CONNECT BY LEVEL; however, the performance was not acceptable.

You could try a pipelined function (here with a lateral join):
SQL> WITH widgets AS (
  2    SELECT 'A' WidgetName, '1, 2, 3' WidgetCodeList FROM dual
  3    UNION ALL SELECT 'B', '1' FROM DUAL
  4    UNION ALL SELECT 'C', '2, 3' FROM DUAL
  5    UNION ALL SELECT 'D', '99' FROM DUAL
  6  ), widgetcodes AS (
  7    SELECT ROWNUM widgetcode from dual CONNECT BY LEVEL <= 3
  8  )
  9  SELECT w.widgetname,
 10         to_number(s.column_value) missing_widget
 11    FROM widgets w
 12   CROSS JOIN TABLE(demo_pkg.string_to_tab(w.WidgetCodeList)) s
 13   WHERE NOT EXISTS (SELECT NULL
 14                       FROM widgetcodes ws
 15                      WHERE ws.widgetcode = to_number(s.column_value));

WIDGETNAME MISSING_WIDGET
---------- --------------
D                      99
See this other SO question for an example of a pipelined function that converts a character string to a table; a minimal sketch follows.
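The demo_pkg.string_to_tab package used above is not shown here; a minimal sketch of such a pipelined function might look like this (the type/function names and the comma separator are assumptions, not the original package):

create or replace type t_str_tab as table of varchar2(4000);
/
create or replace function string_to_tab (p_list in varchar2)
  return t_str_tab pipelined
is
  l_rest varchar2(32767) := p_list;
  l_pos  pls_integer;
begin
  loop
    l_pos := instr(l_rest, ',');
    exit when l_pos = 0 or l_rest is null;
    pipe row (trim(substr(l_rest, 1, l_pos - 1)));  -- emit the next element
    l_rest := substr(l_rest, l_pos + 1);
  end loop;
  if l_rest is not null then
    pipe row (trim(l_rest));  -- emit the trailing element
  end if;
  return;
end;
/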

In PL/SQL, you could make use of the Apex utility for converting a delimited string to a PL/SQL collection like this:
procedure validate_csv (p_csv varchar2)
is
  v_array apex_application_global.vc_arr2;
  v_dummy varchar2(1);
begin
  v_array := apex_util.string_to_table(p_csv, ', ');
  for i in 1..v_array.count
  loop
    begin
      select null
        into v_dummy
        from widgetcodes
       where widgetcode = v_array(i);
    exception
      when no_data_found then
        raise_application_error(-20001, 'Invalid widget code: '||v_array(i));
    end;
  end loop;
end;
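A quick anonymous block to exercise it (hypothetical call; 99 is the invalid code from the example data):

begin
  validate_csv('1, 2, 99');  -- raises ORA-20001: Invalid widget code: 99
end;
/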

Related

How to convert a string value returned from an Oracle APEX 20.1 multiselect item into a comma-separated numbers array

I have a multi-select enabled select list. I want to use all the selected ids inside an IN () operator in a PL/SQL query. The selected values are returned as below:
"1","5","4"
I want to use them as numbers, as below:
1,5,4
My query is like:
UPDATE EMPLOYEE SET EMPSTAT = 'Active' WHERE EMPID IN (:P500_EMPIDS);
This is the employee table:
SQL> select * from employee;
EMPID EMPSTAT
---------- --------
1 Inactive
2 Inactive
4 Inactive
5 Inactive
SQL>
This is a way to split comma-separated values into rows (not into a list of values you'd use in IN!). Note that:
- line #3: the REPLACE function replaces double quotes with an empty string
- line #3: the result is then split into rows using REGEXP_SUBSTR with the help of a hierarchical query
SQL> with test (col) as
2 (select '"1","5","4"' from dual)
3 select regexp_substr(replace(col, '"', ''), '[^,]+', 1, level) val
4 from test
5 connect by level <= regexp_count(col, ',') + 1;
VAL
--------------------
1
5
4
SQL>
Usually multiselect items have colon-separated values, e.g. 1:5:4. If that's really the case, the regular expression would look like this:
regexp_substr(col, '[^:]+', 1, level) val
Use it in Apex as:
update employee e set
  e.empstat = 'Active'
where e.empid in
  (select regexp_substr(replace(:P500_EMPIDS, '"', ''), '[^,]+', 1, level)
     from dual
   connect by level <= regexp_count(:P500_EMPIDS, ',') + 1
  );
Result is:
3 rows updated.
SQL> select * from employee order by empid;
EMPID EMPSTAT
---------- --------
1 Active
2 Inactive
4 Active
5 Active
SQL>
Try it.
Thanks for helping, everyone. Please check this and tell me if anything is wrong. I found a solution as below:
DECLARE
  l_selected APEX_APPLICATION_GLOBAL.VC_ARR2;
BEGIN
  l_selected := APEX_UTIL.STRING_TO_TABLE(:P500_EMPIDS);
  FOR i IN 1 .. l_selected.count LOOP
    UPDATE EMPLOYEE SET EMPSTAT = 'Active' WHERE EMPID = to_number(l_selected(i));
  END LOOP;
END;
You can use the apex_string API for this. If you want to use the IN operator you'll have to use EXECUTE IMMEDIATE, because you cannot use a concatenated string in an IN operator; a sketch of that route is below.
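For completeness, a hedged sketch of the EXECUTE IMMEDIATE route (the ids end up concatenated into the statement text, so validate them first to avoid SQL injection):

DECLARE
  -- assumption: :P500_EMPIDS holds colon-separated ids such as 1:5:4
  l_ids VARCHAR2(4000) := REPLACE(:P500_EMPIDS, ':', ',');
BEGIN
  EXECUTE IMMEDIATE
    'UPDATE EMPLOYEE SET EMPSTAT = ''Active'' WHERE EMPID IN (' || l_ids || ')';
END;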
Instead, what you could do is the following:
DECLARE
  l_array apex_t_varchar2;
BEGIN
  l_array := apex_string.split(p_str => :P500_EMPIDS, p_sep => ':');
  FOR i IN 1..l_array.count LOOP
    UPDATE EMPLOYEE SET EMPSTAT = 'Active' WHERE EMPID = l_array(i);
  END LOOP;
END;
Explanation: convert the colon separated list of ids to a table of varchar2, then loop through the elements of that table.
Note that I'm using ":" as the separator, since that is what APEX uses for multi-selects. If you need "," then change the code above accordingly, as in the snippet below.
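For the quoted, comma-separated format shown in the question, the split might look like this (a sketch; the elements keep their double quotes, so strip them before comparing):

l_array := apex_string.split(p_str => :P500_EMPIDS, p_sep => ',');
-- each element is still quoted, e.g. "1"; use trim('"' from l_array(i))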
Note that you can use apex_string directly within an update statement, so the answer of Koen Lostrie could be modified to not need a loop:
UPDATE EMPLOYEE
SET EMPSTAT = 'Active'
WHERE EMPID IN (
select to_number(trim('"' from column_value))
from table(apex_string.split(:P500_EMPIDS,','))
);
Testcase:
with cte1 as (
  select '"1","2","3"' as x from dual
)
select to_number(trim('"' from s.column_value)) empid
from cte1 c,
     table(apex_string.split(c.x, ',')) s;

How to select the second split of column data from an Oracle database

I want to select data from an Oracle table where the column contains comma-separated values (e.g. key,value); I want to select the second part of the split, i.e. the value.
The table column data is as below:
column_data
++++++++++++++
asper,worse
tincher,good
golder
null -- null values need to be eliminated during selection
www,ewe
From the above data, the desired output is:
column_data
+++++++++++++
worse
good
golder
ewe
Please help me with the query.
According to the data you provided, here are two options:
- result1: a regular-expression approach (get the 2nd word if it exists; otherwise, get the 1st)
- result2: a SUBSTR + INSTR combination
SQL> with test (col) as
2 (select 'asper,worse' from dual union all
3 select 'tincher,good' from dual union all
4 select 'golder' from dual union all
5 select null from dual union all
6 select 'www,ewe' from dual
7 )
8 select col,
9 nvl(regexp_substr(col, '\w+', 1, 2), regexp_substr(col, '\w+', 1, 1)) result1,
10 --
11 nvl(substr(col, instr(col, ',') + 1), col) result2
12 from test
13 where col is not null;
COL RESULT1 RESULT2
------------ -------------------- --------------------
asper,worse worse worse
tincher,good good good
golder golder golder
www,ewe ewe ewe
SQL>

Oracle PL/SQL set column names using other table

I'm struggling with naming the columns of one table using a reference table that contains those names. I'm sure it should be possible, but I can't seem to find the right solution or think of the correct logic to achieve it...
Situation: I have two tables: one with data, whose columns have descriptive names, and one with a translation of those column names to meaningful names (a reference table, or 'codebook').
I'm looking for a way to return the data of the first table with the column names given in the second column of the second table.
Tables look like:
dataTable:
q1,q2,q3
1,2,3
4,5,6
and
translationTable:
descName, meanName
q1, meaning1
q2, meaning2
q3, meaning3
Result should be:
meaning1,meaning2,meaning3
1,2,3
4,5,6
Help would be highly appreciated!
You cannot do it directly, because you need a query whose columns are variable, based on some value.
Slightly differently, what you can do is build dynamic SQL and have Oracle create the query for you:
SETUP:
SQL> create table dataTable(q1,q2,q3) as
2 select 1,2,3 from dual union all
3 select 4,5,6 from dual
4 ;
Table created.
SQL> create table translationTable(descName, meanName) as
2 select 'q1', 'meaning1' from dual union all
3 select 'q2', 'meaning2' from dual union all
4 select 'q3', 'meaning3' from dual ;
Table created.
This will create and print your query:
SQL> declare
2 vSQL varchar2(1000);
3 begin
4 select listagg (column_name || ' AS "' || meanName || '"', ', ') within group (order by column_name)
5 into vSQL
6 from user_tab_columns col
7 inner join translationTable tr
8 on (upper(tr.descName) = col.column_name)
9 where table_name = upper('dataTable');
10 --
11 vSQL := 'select ' || vSQL || ' from dataTable';
12 dbms_output.put_line(vSQL);
13 end;
14 /
select Q1 AS "meaning1", Q2 AS "meaning2", Q3 AS "meaning3" from dataTable
PL/SQL procedure successfully completed.
If you copy the statement and run it:
SQL> select Q1 AS "meaning1", Q2 AS "meaning2", Q3 AS "meaning3" from dataTable;
meaning1 meaning2 meaning3
---------- ---------- ----------
1 2 3
4 5 6
SQL>
This way you have your query, but you cannot simply fetch it, because it still has variable columns; one way out, sketched below, is to open it as a ref cursor and hand it to the client.
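A hedged sketch of that approach using a SYS_REFCURSOR (DBMS_SQL.RETURN_RESULT requires 12c or later; on 11g you would return the cursor to the caller instead):

declare
  vSQL varchar2(1000);
  vCur sys_refcursor;
begin
  select listagg(column_name || ' AS "' || meanName || '"', ', ')
           within group (order by column_name)
    into vSQL
    from user_tab_columns col
    join translationTable tr on upper(tr.descName) = col.column_name
   where table_name = 'DATATABLE';

  vSQL := 'select ' || vSQL || ' from dataTable';

  open vCur for vSQL;            -- the column list is decided at open time
  dbms_sql.return_result(vCur);  -- 12c+: stream the result set to the client
end;
/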
You can also edit the original code to make it build a query that returns strings, composed by concatenating the fields; this way you will always have a single column, but it's different from what you asked:
SQL> select 'meaning1, meaning2, meaning3' from dual
2 union all
3 select Q1 || ',' || Q2 || ',' || Q3 from dataTable;
'MEANING1,MEANING2,MEANING3'
--------------------------------------------------------------------------------
meaning1, meaning2, meaning3
1,2,3
4,5,6

remove a varchar2 string from the middle of table data values

Data in the file_name field of the generation table should be an assigned number, then _01, _02, or _03, etc., and then .pdf (example: 82617_01.pdf).
Somewhere, the program is putting a state name, and sometimes a date/time stamp, between the assigned number and the 01, 02, etc. (for example 82617_ALABAMA_01.pdf, 19998_MAINE_07-31-2010_11-05-59_AM.pdf, or 5485325_OREGON_01.pdf).
We would like to develop a SQL statement to find the bad file names and fix them. In theory it seems rather simple to find file names that contain an extra substring and remove it, but putting the statement together is beyond me.
Any help or suggestions appreciated.
Something like:
UPDATE GENERATION
SET FILE_NAME (?)
WHERE FILE_NAME (?...LIKE '%STRING%');?
You can find the problem rows like this:
select *
  from Files
 where length(FILE_NAME) - length(replace(FILE_NAME, '_', '')) > 1;
You can fix them like this:
update Files
   set FILE_NAME = SUBSTR(FILE_NAME, 1, instr(FILE_NAME, '_') - 1) ||
                   SUBSTR(FILE_NAME, instr(FILE_NAME, '_', 1, 2))
 where length(FILE_NAME) - length(replace(FILE_NAME, '_', '')) > 1;
SQL Fiddle Example
You can also use the REGEXP_REPLACE function:
SQL> with t1(col) as(
2 select '82617_mm_01.pdf' from dual union all
3 select '456546_khkjh_89kjh_67_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col
8 , regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3') res
9 from t1;
COL RES
-------------------------------------- -----------------------------------------
82617_mm_01.pdf 82617_01.pdf
456546_khkjh_89kjh_67_01.pdf 456546_01.pdf
19998_MAINE_07-31-2010_11-05-59_AM.pdf 19998_MAINE_07-31-2010_11-05-59_AM.pdf
5485325_OREGON_01.pdf 5485325_01.pdf
To display good or bad data, the regexp_like function will come in handy:
SQL> with t1(col) as(
2 select '826170_01.pdf' from dual union all
3 select '456546_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col bad_data
8 from t1
9 where not regexp_like(col, '^[0-9]+_\d{2}\.pdf$');
BAD_DATA
--------------------------------------
19998_MAINE_07-31-2010_11-05-59_AM.pdf
5485325_OREGON_01.pdf
SQL> with t1(col) as(
2 select '826170_01.pdf' from dual union all
3 select '456546_01.pdf' from dual union all
4 select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
5 select '5485325_OREGON_01.pdf' from dual
6 )
7 select col good_data
8 from t1
9 where regexp_like(col, '^[0-9]+_\d{2}\.pdf$');
GOOD_DATA
--------------------------------------
826170_01.pdf
456546_01.pdf
To that end, your update statement might look like this:
update your_table
set col = regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3');
--where clause if needed
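For example, combining the two ideas (a sketch with the hypothetical table/column names used above; rows already matching the good pattern are left untouched):

update your_table
   set col = regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3')
 where not regexp_like(col, '^[0-9]+_\d{2}\.pdf$');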

Optimizing row by row (cursor) processing in Oracle 11g

I have to process a large table (2.5B records) row by row in order to keep track of two variables. As one can imagine, this is quite slow. I am looking for ideas on how to tune this procedure. Thank you.
declare
  cursor c_data is
    select /*+ index(data data_pk) */ * from data order by data_id;
  r_data    c_data%ROWTYPE;
  lst_b_prc number(15,8);
  lst_a_prc number(15,8);
begin
  open c_data;
  loop
    fetch c_data into r_data;
    exit when c_data%NOTFOUND;
    if r_data.BATS = 'B' then
      lst_b_prc := r_data.PRC;
    end if;
    if r_data.BATS = 'A' then
      lst_a_prc := r_data.PRC;
    end if;
    if r_data.BATS = 'T' then
      insert into trans .... lst_a_prc , lst_b_prc
    end if;
  end loop;
  close c_data;
end;
The issue really comes down to finding efficient SQL to track the latest PRC value for BATS = 'A' and BATS = 'B' at each BATS = 'T' record.
If I understand your problem correctly, with a table of data like this:
create table data as
select 1 data_id, 'T' bats, 1 prc from dual union all
select 2 data_id, 'A' bats, 2 prc from dual union all
select 3 data_id, 'B' bats, 3 prc from dual union all
select 4 data_id, 'T' bats, 4 prc from dual union all
select 5 data_id, 'A' bats, 5 prc from dual union all
select 6 data_id, 'T' bats, 6 prc from dual union all
select 7 data_id, 'B' bats, 7 prc from dual union all
select 8 data_id, 'T' bats, 8 prc from dual union all
select 9 data_id, 'T' bats, 9 prc from dual;
You want to insert one row for each T, using the last PRC values seen for A and B, which would look something like this:
T data_id   Last A   Last B
---------   ------   ------
        1     null     null
        4        2        3
        6        5        3
        8        5        7
        9        5        7
This query should work:
select data_id, last_A, last_B
from
(
  select data_id, bats, prc
        ,last_value(case when bats = 'A' then prc else null end ignore nulls) over
           (order by data_id
            rows between unbounded preceding and current row) last_A
        ,last_value(case when bats = 'B' then prc else null end ignore nulls) over
           (order by data_id
            rows between unbounded preceding and current row) last_B
  from data
)
where bats = 'T';
With so much data, you'll probably want to use direct path writes and parallelism.
The performance will largely depend on whether the sorting for the analytic functions can be done in memory or on disk. Optimizing memory can be very difficult; you'll probably need to work with a DBA to allow your process to use as much memory as possible without causing problems for other processes.
There are several options. Most importantly, you're probably generating a huge amount of UNDO/REDO for all your inserts. You could commit your work occasionally, say every 1000 inserts.
Another option is to use a SQL MERGE statement (or a simpler INSERT .. SELECT .. statement), which will allow your Oracle instance to operate on sets rather than on single records; the execution plan of your SELECT might then be optimised for INSERT performance. A sketch follows.
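Putting the pieces together, a hedged sketch of the set-based load (the trans column names here are assumptions; APPEND requests direct-path writes and PARALLEL spreads the work):

alter session enable parallel dml;

insert /*+ append parallel(t) */ into trans t (data_id, lst_a_prc, lst_b_prc)
select data_id, last_A, last_B
from (
  select data_id, bats
        ,last_value(case when bats = 'A' then prc end ignore nulls)
           over (order by data_id) last_A
        ,last_value(case when bats = 'B' then prc end ignore nulls)
           over (order by data_id) last_B
  from data
)
where bats = 'T';

commit;  -- a direct-path insert must be committed before the table is queried again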
