Remove coma separated string from another coma separated string in oracle - oracle

Column1 =A,B,C,D,E,F
Column2 =C,D,A,F,C,B (It can have duplicates)
I need to remove column2 values from column1 and get the missing value.
Desired output
(Column1)-(Column2) = E

Split columns' contents into rows, use MINUS set operator. Sample data in lines #1 - 3; query begins at line #4.
SQL> with test (col1, col2) as
2 (select 'A,B,C,D,E,F', 'C,D,A,F,C,B' from dual
3 )
4 select regexp_substr(col1, '[^,]+', 1, level) val
5 from test
6 connect by level <= regexp_count(col1, ',') + 1
7 minus
8 select regexp_substr(col2, '[^,]+', 1, level) val
9 from test
10 connect by level <= regexp_count(col2, ',') + 1
11 /
VAL
--------------------------------------------
E
SQL>
If you're comparing columns in a multi-row table, the above approach won't work OK as it'll retrieve duplicates and will be slow. In that case, rewrite it to
SQL> with test (id, col1, col2) as
2 (select 1, 'A,B,C,D,E,F', 'C,D,A,F,C,B' from dual union all
3 select 2, 'A,B,C,D,E,F', 'A,B,B,B' from dual
4 )
5 select id, listagg(val, ',') within group (order by val) missing_letters
6 from
7 (
8 select id,
9 regexp_substr(col1, '[^,]+', 1, column_value) val
10 from test cross join
11 table(cast(multiset(select level from dual
12 connect by level <= regexp_count(col1, ',') + 1
13 ) as sys.odcinumberlist))
14 minus
15 select id,
16 regexp_substr(col2, '[^,]+', 1, column_value) val
17 from test cross join
18 table(cast(multiset(select level from dual
19 connect by level <= regexp_count(col2, ',') + 1
20 ) as sys.odcinumberlist))
21 )
22 group by id;
ID MISSING_LETTERS
---------- --------------------
1 E
2 C,D,E,F
SQL>

You may use translate function with additional cleanup logic to remove all remaining commas. This will work only for single character replacement (one character between commas), but doesn't require to split string into tokens and uses simple string functions.
with a(col1, col2) as (
select 'A,B,C,D,E,F', 'C,D,A,F,C,B' from dual
)
select
/*Then remove leading and trailing commas*/
trim(',' from
/*Then condense all intermediate commas and spaces*/
regexp_replace(
/*Do actual replacement*/
translate(col1, replace(col2, ','), ' '),
'[, ]+', ','
)
) as res
from a
| RES |
| :-- |
| E |
db<>fiddle here

You do not need to split the string.
If your delimited values do not have any characters with special meaning in regular expressions then you can double-up the delimiters in col1 and then convert col2 to a regular expression and replace matches with an empty string and then remove the excess delimiters:
SELECT col1,
col2,
TRIM(
BOTH ',' FROM
REPLACE(
REGEXP_REPLACE(
',' || REPLACE(col1, ',', ',,') || ',',
',(' || REPLACE(col2, ',', '|') || '),'
),
',,',
','
)
) AS missing
FROM table_name;
Which, for the sample data:
CREATE TABLE table_name ( col1, col2 ) AS
SELECT 'A,B,C,D,E,F', 'C,D,A,F,C,B' FROM DUAL UNION ALL
SELECT 'A,AB,BA,B,', 'A,B' FROM DUAL;
Outputs:
COL1
COL2
MISSING
A,B,C,D,E,F
C,D,A,F,C,B
E
A,AB,BA,B,
A,B
AB,BA
If you do have characters with special meaning then you can do a similar replacement using a recursive sub-query:
WITH replacements ( col1, col2 ) AS (
SELECT ',' || REPLACE( col1, ',', ',,') || ',',
col2 || ','
FROM table_name
UNION ALL
SELECT REPLACE(col1, ',' || SUBSTR(col2, 1, INSTR(col2, ','))),
SUBSTR(col2, INSTR(col2, ',') + 1)
FROM replacements
WHERE col2 IS NOT NULL
)
SELECT TRIM(BOTH ',' FROM REPLACE(col1, ',,', ',')) AS missing
FROM replacements
WHERE col2 IS NULL
Which outputs:
MISSING
AB,BA
E
Note: both of these queries only require a single table scan.
db<>fiddle here

Using ora:tokenize you could do something like this (including a few test cases in the with clause; you should remove it, and use your actual table and column names in the main query):
with
inputs (col1, col2) as (
select 'A,B,C,D,E,F', 'C,D,A,F,C,B' from dual union all
select 'D,,F' , 'F,A' from dual union all
select 'A,B,E,F' , 'E' from dual union all
select 'ABC' , 'A,B,ABC' from dual
)
-- END OF TEST DATA; QUERY BEGINS **BELOW THIS LINE**
select i.col1, i.col2, l.diff
from inputs i cross join lateral
( select listagg(token, ',') within group (order by null) as diff
from xmltable('ora:tokenize(.,",")' passing i.col1 || ','
columns token varchar2(10) path '.')
where not ',' || col2 || ',' like '%,' || token || ',%' ) l
;
COL1 COL2 DIFF
----------- ----------- --------------------
A,B,C,D,E,F C,D,A,F,C,B E
D,,F F,A D
A,B,E,F E A,B,F
ABC A,B,ABC

Related

ORACLE - How to use LAG to display strings from all previous rows into current row

I have data like below:
group
seq
activity
A
1
scan
A
2
visit
A
3
pay
B
1
drink
B
2
rest
I expect to have 1 new column "hist" like below:
group
seq
activity
hist
A
1
scan
NULL
A
2
visit
scan
A
3
pay
scan, visit
B
1
drink
NULL
B
2
rest
drink
I was trying to solve with LAG function, but LAG only returns one row from previous instead of multiple.
Truly appreciate any help!
Use a correlated sub-query:
SELECT t.*,
(SELECT LISTAGG(activity, ',') WITHIN GROUP (ORDER BY seq)
FROM table_name l
WHERE t."GROUP" = l."GROUP"
AND l.seq < t.seq
) AS hist
FROM table_name t
Or a hierarchical query:
SELECT t.*,
SUBSTR(SYS_CONNECT_BY_PATH(PRIOR activity, ','), 3) AS hist
FROM table_name t
START WITH seq = 1
CONNECT BY
PRIOR seq + 1 = seq
AND PRIOR "GROUP" = "GROUP"
Or a recursive sub-query factoring clause:
WITH rsqfc ("GROUP", seq, activity, hist) AS (
SELECT "GROUP", seq, activity, NULL
FROM table_name
WHERE seq = 1
UNION ALL
SELECT t."GROUP", t.seq, t.activity, r.hist || ',' || r.activity
FROM rsqfc r
INNER JOIN table_name t
ON (r."GROUP" = t."GROUP" AND r.seq + 1 = t.seq)
)
SEARCH DEPTH FIRST BY "GROUP" SET order_rn
SELECT "GROUP", seq, activity, SUBSTR(hist, 2) AS hist
FROM rsqfc
Which, for the sample data:
CREATE TABLE table_name ("GROUP", seq, activity) AS
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL;
All output:
GROUP
SEQ
ACTIVITY
HIST
A
1
scan
null
A
2
visit
scan
A
3
pay
scan,visit
B
1
drink
null
B
2
rest
drink
db<>fiddle here
To aggregate strings in Oracle we use LISAGG function.
In general, you need a windowing_clause to specify a sliding window for analytic function to calculate running total.
But unfortunately LISTAGG doesn't support it.
To simulate this behaviour you may use model_clause of the select statement. Below is an example with explanation.
select
group_
, activity
, seq
, hist
from t
model
/*Where to restart calculation*/
partition by (group_)
/*Add consecutive numbers to reference "previous" row per group.
May use "seq" column if its values are consecutive*/
dimension by (
row_number() over(
partition by group_
order by seq asc
) as rn
)
measures (
/*Other columnns to return*/
activity
, cast(null as varchar2(1000)) as hist
, seq
)
rules update (
/*Apply this rule sequentially*/
hist[any] order by rn asc =
/*Previous concatenated result*/
hist[cv()-1]
/*Plus comma for the third row and tne next rows*/
|| presentv(activity[cv()-2], ',', '') /**/
/*lus previous row's value*/
|| activity[cv()-1]
)
GROUP_ | ACTIVITY | SEQ | HIST
:----- | :------- | --: | :---------
A | scan | 1 | null
A | visit | 2 | scan
A | pay | 3 | scan,visit
B | drink | 1 | null
B | rest | 2 | drink
db<>fiddle here
Few more variants (without subqueries):
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
DBFIddle: https://dbfiddle.uk/?rdbms=oracle_21&fiddle=9b477a2089d3beac62579d2b7103377a
Full test case with output:
with table_name ("GROUP", seq, activity) AS (
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL
)
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
GROUP SEQ ACTIV HIST1 HIST2
------ ---------- ----- ------------------------------ ------------------------------
A 1 scan
A 2 visit scan, scan
A 3 pay scan,visit, scan,visit
B 1 drink
B 2 rest drink, drink

SQL | SPLIT COLUMNS INTO ROWS

How can I split the column data into rows with basic SQL.
COL1 COL2
1 A-B
2 C-D
3 AAA-BB
Result
COL1 Col2
1 A
1 B
2 C
2 D
3 AAA
3 BB
From Oracle 12, if it is always two delimited values then you can use:
SELECT t.col1,
l.col2
FROM table_name t
CROSS JOIN LATERAL (
SELECT SUBSTR(col2, 1, INSTR(col2, '-') - 1) AS col2 FROM DUAL
UNION ALL
SELECT SUBSTR(col2, INSTR(col2, '-') + 1) FROM DUAL
) l
Which, for the sample data:
CREATE TABLE table_name (COL1, COL2) AS
SELECT 1, 'A-B' FROM DUAL UNION ALL
SELECT 2, 'C-D' FROM DUAL UNION ALL
SELECT 3, 'AAA-BB' FROM DUAL;
Outputs:
COL1
COL2
1
A
1
B
2
C
2
D
3
AAA
3
BB
db<>fiddle here
Snowflake is tagged, so here's the snowflake way of doing this:
WITH TEST (col1, col2) as
(select 1, 'A-B' from dual union all
select 2, 'C-D' from dual union all
select 3, 'AAA-BB' from dual
)
SELECT test.col1, table1.value
FROM test, LATERAL strtok_split_to_table(test.col2, '-') as table1
ORDER BY test.col1, table1.value;
As of Oracle:
SQL> with test (col1, col2) as
2 (select 1, 'A-B' from dual union all
3 select 2, 'C-D' from dual union all
4 select 3, 'AAA-BB' from dual
5 )
6 select col1,
7 regexp_substr(col2, '[^-]+', 1, column_value) col2
8 from test cross join
9 table(cast(multiset(select level from dual
10 connect by level <= regexp_count(col2, '-') + 1
11 ) as sys.odcinumberlist))
12 order by col1, col2;
COL1 COL2
---------- ------------------------
1 A
1 B
2 C
2 D
3 AAA
3 BB
6 rows selected.
SQL>
For MS-SQL 2016 and higher you can use:
SELECT Col1, x.value
FROM t CROSS APPLY STRING_SPLIT(t.Col2, '-') as x;
BTW: If Col2 contains null, it does not appear in the result.

How to compare 2 columns and return the difference in oracle SQL

We have 2 columns in one table in oracle SQL as
Col1= "there is book on the table"
Col2= "there are flowers on the chair"
Now I need the result as differed data in the column3 as new column col3.
The col3 result should be
"are flowers chair".
How to achieve this in oracle SQL??
You can use:
WITH words ( rid, col, name, id, word ) AS (
SELECT rid,
CASE INSTR(col, ' ')
WHEN 0
THEN NULL
ELSE SUBSTR(col, INSTR(col, ' ') + 1)
END,
name,
1,
CASE INSTR(col, ' ')
WHEN 0
THEN col
ELSE SUBSTR(col, 1, INSTR(col, ' ') - 1)
END
FROM ( SELECT ROWID AS rid, col1, col2 FROM table_name )
UNPIVOT ( col FOR name IN (col1, col2) )
UNION ALL
SELECT rid,
CASE INSTR(col, ' ')
WHEN 0
THEN NULL
ELSE SUBSTR(col, INSTR(col, ' ') + 1)
END,
name,
id + 1,
CASE INSTR(col, ' ')
WHEN 0
THEN col
ELSE SUBSTR(col, 1, INSTR(col, ' ') - 1)
END
FROM words
WHERE col IS NOT NULL
),
paired_words ( rid, id1, id2 ) AS (
SELECT c1.rid,
c1.id AS id1,
c2.id AS id2
FROM ( SELECT rid, id, word FROM words WHERE name = 'COL1' ) c1
INNER JOIN
( SELECT rid, id, word FROM words WHERE name = 'COL2' ) c2
ON (c1.rid = c2.rid AND c1.word = c2.word)
),
max_path ( rid, path ) AS (
SELECT rid,
path
FROM (
SELECT rid,
SYS_CONNECT_BY_PATH(id2, ',') || ',' AS path,
ROW_NUMBER() OVER (PARTITION BY rid ORDER BY LEVEL DESC) AS rn
FROM paired_words
CONNECT BY PRIOR rid = rid
AND PRIOR id1 < id1
AND PRIOR id2 < id2
)
WHERE rn = 1
)
SELECT LISTAGG(word, ' ') WITHIN GROUP (ORDER BY id) AS missing
FROM words w
WHERE NOT EXISTS (
SELECT 1
FROM max_path mp
WHERE w.rid = mp.rid
AND mp.path LIKE '%,' || w.id || ',%'
)
AND w.name = 'COL2'
GROUP BY rid;
Which, for the sample data:
CREATE TABLE table_name ( col1, col2 ) AS
SELECT 'there is book on the table', 'there are flowers on the chair' FROM DUAL UNION ALL
SELECT 'there is book on the table', 'there is a book on the table' FROM DUAL UNION ALL
SELECT 'there is book on the table', 'there is book there is book on the table on the table' FROM DUAL
Outputs:
MISSING
are flowers chair
a
there is book on the table
db<>fiddle here
Here's one option (which follows what you asked). Read comments within code.
SQL> with test (id, col1, col2) as
2 (select 1, 'there is book on the table',
3 'there are flowers on the chair'
4 from dual
5 ),
6 -- split sentences into words (each in its own line)
7 sent1 as
8 (select id,
9 column_value cv,
10 regexp_substr(col1, '[^ ]+', 1, column_value) word
11 from test cross join
12 table(cast(multiset(select level from dual
13 connect by level <= regexp_count(col1, ' ') + 1
14 ) as sys.odcinumberlist))
15 ),
16 sent2 as
17 (select id,
18 column_value cv,
19 regexp_substr(col2, '[^ ]+', 1, column_value) word
20 from test cross join
21 table(cast(multiset(select level from dual
22 connect by level <= regexp_count(col2, ' ') + 1
23 ) as sys.odcinumberlist))
24 )
25 -- final result
26 select a.id,
27 listagg(b.word, ' ') within group (order by a.cv) result
28 from sent2 b join sent1 a on a.id = b.id and a.cv = b.cv and a.word <> b.word
29 group by a.id;
ID RESULT
---------- ------------------------------
1 are flowers chair
SQL>

Oracle hierarchy query to find parents and child values From a table

I have a source table which contains values as :
Col1 Col2
A B
B C
E F
F G
G H
X Y
In this scenario A is a parent and b is child of A
And C is a grand child of A, parent and it's child with grand child's should come in one single line.
So the expected output is
Output :
A B C
E F G H
X Y
Oracle 11gR2 Setup:
CREATE TABLE table_name ( Col1, Col2 ) AS
SELECT 'A', 'B' FROM DUAL UNION ALL
SELECT 'B', 'C' FROM DUAL UNION ALL
SELECT 'E', 'F' FROM DUAL UNION ALL
SELECT 'F', 'G' FROM DUAL UNION ALL
SELECT 'G', 'H' FROM DUAL UNION ALL
SELECT 'X', 'Y' FROM DUAL;
Query:
SELECT SUBSTR( SYS_CONNECT_BY_PATH( Col1, ' ' ) || ' ' || Col2, 2 ) AS path
FROM table_name
WHERE CONNECT_BY_ISLEAF = 1
START WITH Col1 NOT IN ( SELECT Col2 FROM table_name )
CONNECT BY PRIOR Col2 = Col1;
Explanation:
Start (line 4) with each Col1 where there is not a parent row identified by a corresponding Col2 value and create a hierarchical query connecting (Line 5) Col1 the the prior parent row.
Filter the output only to those rows which are a leaf of the hierarchical tree (line 3) - i.e. those with no children.
You can then use SYS_CONNECT_BY_PATH to generate a string containing all the Col1 values from the root to the leaf of each branch of the tree generated by the hierarchy and concatenate that with the final Col2 value at the leaf. SUBSTR is used to remove the leading space delimiter that SYS_CONNECT_BY_PATH prepends to each entry in the path.
Output:
PATH
-------
A B C
E F G H
X Y
Is it what you search for?
SQL> with
2 src as (select 'A' p#, 'B' c# from dual union all
3 select 'B' p#, 'C' c# from dual union all
4 select 'E' p#, 'F' c# from dual union all
5 select 'F' p#, 'G' c# from dual union all
6 select 'G' p#, 'H' c# from dual union all
7 select 'X' p#, 'Y' c# from dual)
8 select
9 max(trim(sys_connect_by_path(p#, ' ') || ' ' || c#)) r#
10 from
11 src
12 start with
13 p# not in (select c# from src)
14 connect by p# = prior c#
15 group by connect_by_root(p#);
R#
--------------------------------------------------------------------------------
A B C
X Y
E F G H
May be this code can help you.
WITH t1(id, parent_id) AS (
-- Anchor member.
SELECT id,
PARENT
FROM table
WHERE id = 'A'
UNION ALL
-- Recursive member.
SELECT t2.id,
t2.PARENT
FROM table t2, table t1
WHERE t2.PARENT = t1.id
)
SELECT id, parent_id
FROM t1;

How to get the following result?

How to get the following result ?
Input : -
t1
--------------
col1 col2 col3
--------------
101, abc, 100
101, xyz, 200
101, rst, 300
-------------
Output : -
101 abc 100 xyz 200 rst 300
Please try:
SELECT col1, replace(wm_concat(col2||col3),',', '') FROM t1 GROUP BY col1;
or
SELECT col1, (SELECT XMLAGG(xmlelement(X, X1.col2||col3)order by X1.col2).extract('//text()')
FROM t1 X1 WHERE X1.col1=X.col1)
FROM t1 X
This works with 11g and maintains the order of items.
select LISTAGG (code, ' ') WITHIN GROUP (ORDER BY rn, coln)
from(
select code, min(rn) as rn, min(coln) as coln
from(
select col1 as code, rownum rn, 1 as coln from t order by col1, col3
union all
select col2 as code, rownum rn, 2 as coln from t order by col1, col3
union all
select col3 as code, rownum rn, 3 as coln from t order by col1, col3
)
group by code
)
Try with
select col1 || ' ' || listagg( col2 || ' ' || col3, ' ' ) within group ( order by COL3 )
from t1
group by col1;

Resources