Oracle - Performance between Regexp_substr and Instr

As the title says, in some cases I see Regexp_substr run faster and at lower cost than Instr, and in other cases it is the opposite.
I don't know when I should use Instr and when Regexp_substr. Can someone explain and describe the benefits of each? An example follows:
**Regexp_substr:**
SELECT * FROM tabl1
WHERE 1 = 1
AND col1 IN (
SELECT regexp_substr(abc,'[^,]+',1,level) AS A
FROM (
SELECT '001' abc -- replace with parameter
FROM DUAL
)
CONNECT BY LEVEL <= LENGTH (REGEXP_REPLACE (abc,'[^,]'))+1 );
**Instr:**
SELECT * FROM tabl1
WHERE 1 = 1
AND INSTR (',' || '001' || ',',',' || col1 || ',') > 0 ;
Thanks!

Related

Remove comma separated string from another comma separated string in oracle

Column1 =A,B,C,D,E,F
Column2 =C,D,A,F,C,B (It can have duplicates)
I need to remove column2 values from column1 and get the missing value.
Desired output
(Column1)-(Column2) = E
Split the columns' contents into rows, then use the MINUS set operator. Sample data in lines #1 - 3; query begins at line #4.
SQL> with test (col1, col2) as
2 (select 'A,B,C,D,E,F', 'C,D,A,F,C,B' from dual
3 )
4 select regexp_substr(col1, '[^,]+', 1, level) val
5 from test
6 connect by level <= regexp_count(col1, ',') + 1
7 minus
8 select regexp_substr(col2, '[^,]+', 1, level) val
9 from test
10 connect by level <= regexp_count(col2, ',') + 1
11 /
VAL
--------------------------------------------
E
SQL>
If you're comparing columns in a multi-row table, the above approach won't work properly: the CONNECT BY will generate duplicate rows and the query will be slow. In that case, rewrite it as
SQL> with test (id, col1, col2) as
2 (select 1, 'A,B,C,D,E,F', 'C,D,A,F,C,B' from dual union all
3 select 2, 'A,B,C,D,E,F', 'A,B,B,B' from dual
4 )
5 select id, listagg(val, ',') within group (order by val) missing_letters
6 from
7 (
8 select id,
9 regexp_substr(col1, '[^,]+', 1, column_value) val
10 from test cross join
11 table(cast(multiset(select level from dual
12 connect by level <= regexp_count(col1, ',') + 1
13 ) as sys.odcinumberlist))
14 minus
15 select id,
16 regexp_substr(col2, '[^,]+', 1, column_value) val
17 from test cross join
18 table(cast(multiset(select level from dual
19 connect by level <= regexp_count(col2, ',') + 1
20 ) as sys.odcinumberlist))
21 )
22 group by id;
ID MISSING_LETTERS
---------- --------------------
1 E
2 C,D,E,F
SQL>
You can use the TRANSLATE function with additional cleanup logic to remove the leftover commas. This works only for single-character values (one character between commas), but it doesn't require splitting the string into tokens and uses only simple string functions.
with a(col1, col2) as (
select 'A,B,C,D,E,F', 'C,D,A,F,C,B' from dual
)
select
/*Then remove leading and trailing commas*/
trim(',' from
/*Then condense all intermediate commas and spaces*/
regexp_replace(
/*Do actual replacement*/
translate(col1, replace(col2, ','), ' '),
'[, ]+', ','
)
) as res
from a
| RES |
| :-- |
| E |
You do not need to split the string.
If your delimited values do not contain any characters with special meaning in regular expressions, you can double up the delimiters in col1, convert col2 to a regular expression, replace the matches with an empty string, and then remove the excess delimiters:
SELECT col1,
col2,
TRIM(
BOTH ',' FROM
REPLACE(
REGEXP_REPLACE(
',' || REPLACE(col1, ',', ',,') || ',',
',(' || REPLACE(col2, ',', '|') || '),'
),
',,',
','
)
) AS missing
FROM table_name;
Which, for the sample data:
CREATE TABLE table_name ( col1, col2 ) AS
SELECT 'A,B,C,D,E,F', 'C,D,A,F,C,B' FROM DUAL UNION ALL
SELECT 'A,AB,BA,B,', 'A,B' FROM DUAL;
Outputs:
| COL1 | COL2 | MISSING |
| :-- | :-- | :-- |
| A,B,C,D,E,F | C,D,A,F,C,B | E |
| A,AB,BA,B, | A,B | AB,BA |
If you do have characters with special meaning then you can do a similar replacement using a recursive sub-query:
WITH replacements ( col1, col2 ) AS (
SELECT ',' || REPLACE( col1, ',', ',,') || ',',
col2 || ','
FROM table_name
UNION ALL
SELECT REPLACE(col1, ',' || SUBSTR(col2, 1, INSTR(col2, ','))),
SUBSTR(col2, INSTR(col2, ',') + 1)
FROM replacements
WHERE col2 IS NOT NULL
)
SELECT TRIM(BOTH ',' FROM REPLACE(col1, ',,', ',')) AS missing
FROM replacements
WHERE col2 IS NULL
Which outputs:
| MISSING |
| :-- |
| AB,BA |
| E |
Note: both of these queries only require a single table scan.
Using ora:tokenize you could do something like this (including a few test cases in the with clause; you should remove it, and use your actual table and column names in the main query):
with
inputs (col1, col2) as (
select 'A,B,C,D,E,F', 'C,D,A,F,C,B' from dual union all
select 'D,,F' , 'F,A' from dual union all
select 'A,B,E,F' , 'E' from dual union all
select 'ABC' , 'A,B,ABC' from dual
)
-- END OF TEST DATA; QUERY BEGINS **BELOW THIS LINE**
select i.col1, i.col2, l.diff
from inputs i cross join lateral
( select listagg(token, ',') within group (order by null) as diff
from xmltable('ora:tokenize(.,",")' passing i.col1 || ','
columns token varchar2(10) path '.')
where not ',' || col2 || ',' like '%,' || token || ',%' ) l
;
COL1 COL2 DIFF
----------- ----------- --------------------
A,B,C,D,E,F C,D,A,F,C,B E
D,,F F,A D
A,B,E,F E A,B,F
ABC A,B,ABC

How to convert delimited string to a PL/SQL table for JOINing?

I have the following table:
CREATE TABLE T_DATA
(
id VARCHAR2(20),
value VARCHAR2(30),
index NUMBER,
valid_from DATE,
entry_state VARCHAR2(1),
CONSTRAINT PK_T_DATA PRIMARY KEY(id, value)
);
and I have the following string:
id1:value1,id2:value2,id3:value3...
where each id and value corresponds to values in T_DATA. I'm expected to use that string to return a result set from T_DATA, using the ids and values provided as filters (basically, a SELECT). I was told I can convert the string into a PL/SQL table with two columns, after which a simple SELECT * FROM T_DATA INNER JOIN [PL/SQL table] ON [fields] will retrieve the required rows, but I can't find out how to convert the string into a PL/SQL table with multiple columns. How can I do it?
The simplest solution I can think of (although it may not be the most efficient) is to use a simple INSTR:
WITH
t_data
AS
( SELECT 'id' || ROWNUM AS id,
'value' || ROWNUM AS VALUE,
ROWNUM AS index_num,
SYSDATE - ROWNUM AS valid_from,
'A' AS entry_state
FROM DUAL
CONNECT BY LEVEL <= 10)
SELECT *
FROM t_data
WHERE INSTR ('id1:value1,id3:value3', id || ':' || VALUE) > 0;
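One caveat with the plain INSTR test: it matches substrings, so a row can match a longer entry in the search string. The sketch below uses made-up sample rows and a made-up search string to compare the plain test with a variant that wraps both sides in the delimiter; the column aliases plain_pos and wrapped_pos are only illustrative.

```sql
-- Hypothetical data: the search string contains id1:value10 only.
WITH t_data (id, value) AS (
  SELECT 'id1', 'value1'  FROM DUAL UNION ALL
  SELECT 'id1', 'value10' FROM DUAL
)
SELECT id,
       value,
       -- plain substring test: also finds id1:value1 inside id1:value10
       INSTR('id1:value10', id || ':' || value) AS plain_pos,
       -- delimiter-wrapped test: only whole entries match
       INSTR(',' || 'id1:value10' || ',',
             ',' || id || ':' || value || ',') AS wrapped_pos
FROM t_data;
```

For the row id1/value1, plain_pos is 1 (a false positive) while wrapped_pos is 0; for id1/value10 both are 1.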
If you want to split the search string, you can try a query like this one:
WITH
t_data
AS
( SELECT 'id' || ROWNUM AS id,
'value' || ROWNUM AS VALUE,
ROWNUM AS index_num,
SYSDATE - ROWNUM AS valid_from,
'A' AS entry_state
FROM DUAL
CONNECT BY LEVEL <= 10),
split_string AS (SELECT 'id1:value1,id3:value3' AS str FROM DUAL),
split_data as (
SELECT substr(regexp_substr(str, '[^,]+', 1, LEVEL),1,instr(regexp_substr(str, '[^,]+', 1, LEVEL), ':') - 1) as id,
substr(regexp_substr(str, '[^,]+', 1, LEVEL),instr(regexp_substr(str, '[^,]+', 1, LEVEL), ':') + 1) as value
FROM split_string
CONNECT BY INSTR (str, ',', 1, LEVEL - 1) > 0)
SELECT t.*
FROM t_data t
join split_data s
on( t.id = s.id and t.value = s.value);
You can use LIKE as follows:
SELECT *
FROM T_DATA
WHERE ',' || YOUR_STRING || ',' LIKE '%,' || ID || ':' || VALUE || ',%'

Redundant blank line in query result

I have this sql:
with p_1 as
(
select 1 sorszam, 'X1' tipus from dual
union all select 2 sorszam, 'X2' tipus from dual
union all select 3 sorszam, 'X3' tipus from dual
)
select (
(case when p1.sorszam=1 then ('[' || chr(13) || chr(10)) else '' end) ||
p1.tipus
|| (case when p1.sorszam=(select max(sorszam) from p_1) then (chr(13) || chr(10) || ']') else '' end)
) szoveg
from p_1 p1
order by p1.sorszam
The result is:
SZOVEG
--------
[
X1
X2
X3
]
My question is: why is there a blank line after the first line?
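SQL*Plus prints a record separator (by default a blank line) after every row whose output wraps onto more than one line, because RECSEP defaults to WRAPPED. The first row contains a line break, so it is followed by a separator. Using SET RECSEP OFF removes the record separator.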
http://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12040.htm#i2699269

Oracle PL/SQL Tokenize String with empty position

I have a string like this:
AAA,BBB,,DDD
I would like to tokenize it on the comma and retrieve a table like this:
VALUE LEVEL
AAA 1
BBB 2
(null) 3
DDD 4
I need each string and the position at which it was found, without skipping the empty (null) strings.
I've tried code like this, but it misses the empty position:
SELECT regexp_substr ('AAA,BBB,,DDD', '[^,]+', 1, level), level
FROM dual
CONNECT BY LEVEL <= LENGTH(regexp_replace ('AAA,BBB,,DDD', '[^,]+'));
The output is this:
VALUE LEVEL
AAA 1
BBB 2
DDD 3
Another simple approach is to replace each comma (,) with a comma followed by a space (, ), then trim each token, as below:
SELECT trim(regexp_substr (replace('AAA,BBB,,DDD',',',', '), '[^,]+', 1, level)), level
FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT (replace('AAA,BBB,,DDD',',',', '), '[^,]+');
This also works: http://sqlfiddle.com/#!4/b255d/26
SELECT token, lvl FROM (
SELECT regexp_substr ('AAA,BBB,,DDD', '[^,]*', 1, LEVEL) token, LEVEL lvl,
lag(regexp_substr ('AAA,BBB,,DDD', '[^,]*', 1, LEVEL)) over(order by level) prev_token
FROM dual
CONNECT BY LEVEL <= LENGTH(regexp_replace ('AAA,BBB,,DDD', '[^,]+'))*2
) WHERE prev_token is null;
In Oracle 11g Release 2 and later, which supports recursive subquery factoring, you can use a query like this:
with
tab1(pointer,test,split_test) as
(select
1 as pointer,test,substr(test,0,case when instr(test,',',1,1) = 0 then LENGTH(test)
else instr(test,',',1,1)-1 end) split_test from table1
union all
select
pointer + 1 as pointer,test,
substr(test,instr(test,',',1,pointer) + 1,case when instr(test,',',1,pointer + 1) = 0 then LENGTH(test) else
instr(test,',',1,pointer + 1) - instr(test,',',1,pointer) - 1 end) split_test
from tab1 where pointer - 1 < LENGTH(test)-LENGTH(REPLACE(test,',','')))
select split_test as "value",pointer as "level" from tab1;
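Another idiom sometimes used for this is to match each token together with its trailing delimiter (or the end of the string) and return only the first subexpression. This is a sketch, not a tested solution for your data. It assumes Oracle 11g or later, where REGEXP_SUBSTR accepts a subexpression argument and REGEXP_COUNT is available; the alias lvl is only illustrative.

```sql
-- '(.*?)(,|$)' matches a possibly empty token followed by a comma or end of string;
-- the final argument 1 returns just the token (first subexpression).
SELECT REGEXP_SUBSTR('AAA,BBB,,DDD', '(.*?)(,|$)', 1, LEVEL, NULL, 1) AS value,
       LEVEL AS lvl
FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT('AAA,BBB,,DDD', ',') + 1;
```

For the sample string this should return AAA, BBB, a null and DDD at levels 1 to 4, because the third match is just the comma with an empty token in front of it.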

Oracle Count how much values in varchar

I'm testing with a query:
SELECT '-1,-2,-4,-6' "Verdunning"
FROM DUAL
Now I need to know how many values are in the varchar '-1,-2,-4,-6'; I want to get 4 back.
When it is '-1,-2,-4,-6,-8' I need to get 5 back. How can I do this in an Oracle SELECT statement?
SELECT LENGTH(Verdunning) - LENGTH(REPLACE(Verdunning, ',', '')) + 1
FROM (SELECT '-1,-2,-4,-6' AS Verdunning FROM DUAL) T
Find the number of commas, then add 1 to get the count (Oracle uses LENGTH, not LEN):
SELECT LENGTH(column_name) - LENGTH(REPLACE(column_name, ',', '')) + 1 AS comma_separated_values_count
FROM table_name
Try this:
SELECT
REGEXP_COUNT ( REGEXP_REPLACE ( '-1,-2,-4,-6', '".*?"' ), ',' ) + 1
FROM
DUAL;
Using regexp_substr.
select count(regexp_substr( '-1,-2,-4,-6','[^,]+', 1, level)) from dual
connect by regexp_substr( '-1,-2,-4,-6', '[^,]+', 1, level) is not null;
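One edge case worth checking with the LENGTH/REPLACE approach: Oracle treats an empty string as NULL, so if the input consists only of commas, REPLACE returns NULL and the whole expression becomes NULL. The sketch below compares three variants on two made-up inputs; the column aliases are only illustrative, and REGEXP_COUNT assumes Oracle 11g or later.

```sql
SELECT str,
       -- NULL when nothing but commas is left to measure
       LENGTH(str) - LENGTH(REPLACE(str, ',')) + 1          AS by_length,
       -- NVL guards against the NULL length
       LENGTH(str) - NVL(LENGTH(REPLACE(str, ',')), 0) + 1  AS by_length_nvl,
       -- counts the commas directly
       REGEXP_COUNT(str, ',') + 1                           AS by_regexp_count
FROM (SELECT '-1,-2,-4,-6' AS str FROM DUAL UNION ALL
      SELECT ',,' FROM DUAL);
```

For '-1,-2,-4,-6' all three columns give 4. For ',,' the first column is NULL, while the other two give 3 (three empty values).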

Resources