Is it possible to count and group by comma-delimited values in an Oracle database table? This is an example of the table data:
id | user | title |
1 | foo | a,b,c |
2 | bar | a,d |
3 | tee | b |
The expected result would be:
title | count
a | 2
b | 2
c | 1
d | 1
I wanted to use concat like this:
SELECT a.title FROM Account a WHERE concat(',', a.title, ',') LIKE 'a' OR concat(',', a.title, ',') LIKE 'b' ... GROUP BY a.title?
But I'm getting invalid number of arguments on concat. The title values are predefined, therefore I don't mind if I have to list all of them in the query. Any help is greatly appreciated.
This uses simple string functions and a recursive sub-query factoring clause, and may be faster than using regular expressions and correlated joins:
Oracle Setup:
CREATE TABLE account ( id, "user", title ) AS
SELECT 1, 'foo', 'a,b,c' FROM DUAL UNION ALL
SELECT 2, 'bar', 'a,d' FROM DUAL UNION ALL
SELECT 3, 'tee', 'b' FROM DUAL;
Query:
WITH positions ( title, start_pos, end_pos ) AS (
SELECT title,
1,
INSTR( title, ',', 1 )
FROM account
UNION ALL
SELECT title,
end_pos + 1,
INSTR( title, ',', end_pos + 1 )
FROM positions
WHERE end_pos > 0
),
items ( item ) AS (
SELECT CASE end_pos
WHEN 0
THEN SUBSTR( title, start_pos )
ELSE SUBSTR( title, start_pos, end_pos - start_pos )
END
FROM positions
)
SELECT item,
COUNT(*)
FROM items
GROUP BY item
ORDER BY item;
Output:
ITEM | COUNT(*)
:--- | -------:
a | 2
b | 2
c | 1
d | 1
Split titles to rows and count them.
SQL> with test (id, title) as
2 (select 1, 'a,b,c' from dual union all
3 select 2, 'a,d' from dual union all
4 select 3, 'b' from dual
5 ),
6 temp as
7 (select regexp_substr(title, '[^,]', 1, column_value) val
8 from test cross join table(cast(multiset(select level from dual
9 connect by level <= regexp_count(title, ',') + 1
10 ) as sys.odcinumberlist))
11 )
12 select val as title,
13 count(*)
14 From temp
15 group by val
16 order by val;
TITLE COUNT(*)
-------------------- ----------
a 2
b 2
c 1
d 1
SQL>
If titles aren't that simple, then modify REGEXP_SUBSTR (add + sign) in line #7, e.g.
SQL> with test (id, title) as
2 (select 1, 'Robin Hood,Avatar,Star Wars Episode III' from dual union all
3 select 2, 'Mickey Mouse,Avatar' from dual union all
4 select 3, 'The Godfather' from dual
5 ),
6 temp as
7 (select regexp_substr(title, '[^,]+', 1, column_value) val
8 from test cross join table(cast(multiset(select level from dual
9 connect by level <= regexp_count(title, ',') + 1
10 ) as sys.odcinumberlist))
11 )
12 select val as title,
13 count(*)
14 From temp
15 group by val
16 order by val;
TITLE COUNT(*)
------------------------------ ----------
Avatar 2
Mickey Mouse 1
Robin Hood 1
Star Wars Episode III 1
The Godfather 1
SQL>
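As a side note on the attempt in the question: Oracle's CONCAT accepts exactly two arguments, which is why the three-argument call is rejected; the || operator chains any number of strings. The LIKE patterns also need % wildcards to match inside the list. Because the title values are predefined, a sketch along the following lines might work as well. It assumes the account table from the setup above, and the inline list of titles (a to d) is taken from the sample data rather than from any real schema:

```sql
-- Sketch only: the list of titles is hard-coded from the sample data.
SELECT t.title,
       COUNT(*) AS cnt
FROM   ( SELECT 'a' AS title FROM DUAL UNION ALL
         SELECT 'b' FROM DUAL UNION ALL
         SELECT 'c' FROM DUAL UNION ALL
         SELECT 'd' FROM DUAL ) t
       INNER JOIN account a
       -- Wrap both sides in commas so 'a' cannot match inside a longer title.
       ON ( ',' || a.title || ',' LIKE '%,' || t.title || ',%' )
GROUP BY t.title
ORDER BY t.title;
```

Because this is an inner join, a predefined title that appears in no row is left out of the result instead of being reported with a count of 0.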
I have data like below:
group | seq | activity
A | 1 | scan
A | 2 | visit
A | 3 | pay
B | 1 | drink
B | 2 | rest
I expect to have 1 new column "hist" like below:
group | seq | activity | hist
A | 1 | scan | NULL
A | 2 | visit | scan
A | 3 | pay | scan, visit
B | 1 | drink | NULL
B | 2 | rest | drink
I was trying to solve with LAG function, but LAG only returns one row from previous instead of multiple.
Truly appreciate any help!
Use a correlated sub-query:
SELECT t.*,
(SELECT LISTAGG(activity, ',') WITHIN GROUP (ORDER BY seq)
FROM table_name l
WHERE t."GROUP" = l."GROUP"
AND l.seq < t.seq
) AS hist
FROM table_name t
Or a hierarchical query:
SELECT t.*,
SUBSTR(SYS_CONNECT_BY_PATH(PRIOR activity, ','), 3) AS hist
FROM table_name t
START WITH seq = 1
CONNECT BY
PRIOR seq + 1 = seq
AND PRIOR "GROUP" = "GROUP"
Or a recursive sub-query factoring clause:
WITH rsqfc ("GROUP", seq, activity, hist) AS (
SELECT "GROUP", seq, activity, NULL
FROM table_name
WHERE seq = 1
UNION ALL
SELECT t."GROUP", t.seq, t.activity, r.hist || ',' || r.activity
FROM rsqfc r
INNER JOIN table_name t
ON (r."GROUP" = t."GROUP" AND r.seq + 1 = t.seq)
)
SEARCH DEPTH FIRST BY "GROUP" SET order_rn
SELECT "GROUP", seq, activity, SUBSTR(hist, 2) AS hist
FROM rsqfc
Which, for the sample data:
CREATE TABLE table_name ("GROUP", seq, activity) AS
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL;
All output:
GROUP | SEQ | ACTIVITY | HIST
:---- | --: | :------- | :---------
A | 1 | scan | null
A | 2 | visit | scan
A | 3 | pay | scan,visit
B | 1 | drink | null
B | 2 | rest | drink
To aggregate strings in Oracle, use the LISTAGG function.
In general, an analytic function needs a windowing_clause to define a sliding window, for example to calculate a running total.
Unfortunately, LISTAGG doesn't support it.
To simulate this behaviour, you can use the model_clause of the SELECT statement. Below is an example with explanation.
select
group_
, activity
, seq
, hist
from t
model
/*Where to restart calculation*/
partition by (group_)
/*Add consecutive numbers to reference "previous" row per group.
May use "seq" column if its values are consecutive*/
dimension by (
row_number() over(
partition by group_
order by seq asc
) as rn
)
measures (
/*Other columns to return*/
activity
, cast(null as varchar2(1000)) as hist
, seq
)
rules update (
/*Apply this rule sequentially*/
hist[any] order by rn asc =
/*Previous concatenated result*/
hist[cv()-1]
/*Plus a comma for the third row and the following rows*/
|| presentv(activity[cv()-2], ',', '')
/*Plus the previous row's value*/
|| activity[cv()-1]
)
GROUP_ | ACTIVITY | SEQ | HIST
:----- | :------- | --: | :---------
A | scan | 1 | null
A | visit | 2 | scan
A | pay | 3 | scan,visit
B | drink | 1 | null
B | rest | 2 | drink
A few more variants (without subqueries):
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
DBFiddle: https://dbfiddle.uk/?rdbms=oracle_21&fiddle=9b477a2089d3beac62579d2b7103377a
Full test case with output:
with table_name ("GROUP", seq, activity) AS (
SELECT 'A', 1, 'scan' FROM DUAL UNION ALL
SELECT 'A', 2, 'visit' FROM DUAL UNION ALL
SELECT 'A', 3, 'pay' FROM DUAL UNION ALL
SELECT 'B', 1, 'drink' FROM DUAL UNION ALL
SELECT 'B', 2, 'rest' FROM DUAL
)
SELECT--+ NO_XML_QUERY_REWRITE
t.*,
regexp_substr(
listagg(activity, ',')
within group(order by SEQ)
over(partition by "GROUP")
,'^([^,]+,){'||(row_number()over(partition by "GROUP" order by seq)-1)||'}'
)
AS hist1
,xmlcast(
xmlquery(
'string-join($X/A/B[position()<$Y]/text(),",")'
passing
xmlelement("A", xmlagg(xmlelement("B", activity)) over(partition by "GROUP")) as x
,row_number()over(partition by "GROUP" order by seq) as y
returning content
)
as varchar2(1000)
) hist2
FROM table_name t;
GROUP SEQ ACTIV HIST1 HIST2
------ ---------- ----- ------------------------------ ------------------------------
A 1 scan
A 2 visit scan, scan
A 3 pay scan,visit, scan,visit
B 1 drink
B 2 rest drink, drink
We have two columns in one table in Oracle SQL:
Col1= "there is book on the table"
Col2= "there are flowers on the chair"
Now I need the words that differ between them as a new column, col3.
The col3 result should be
"are flowers chair".
How can I achieve this in Oracle SQL?
You can use:
WITH words ( rid, col, name, id, word ) AS (
SELECT rid,
CASE INSTR(col, ' ')
WHEN 0
THEN NULL
ELSE SUBSTR(col, INSTR(col, ' ') + 1)
END,
name,
1,
CASE INSTR(col, ' ')
WHEN 0
THEN col
ELSE SUBSTR(col, 1, INSTR(col, ' ') - 1)
END
FROM ( SELECT ROWID AS rid, col1, col2 FROM table_name )
UNPIVOT ( col FOR name IN (col1, col2) )
UNION ALL
SELECT rid,
CASE INSTR(col, ' ')
WHEN 0
THEN NULL
ELSE SUBSTR(col, INSTR(col, ' ') + 1)
END,
name,
id + 1,
CASE INSTR(col, ' ')
WHEN 0
THEN col
ELSE SUBSTR(col, 1, INSTR(col, ' ') - 1)
END
FROM words
WHERE col IS NOT NULL
),
paired_words ( rid, id1, id2 ) AS (
SELECT c1.rid,
c1.id AS id1,
c2.id AS id2
FROM ( SELECT rid, id, word FROM words WHERE name = 'COL1' ) c1
INNER JOIN
( SELECT rid, id, word FROM words WHERE name = 'COL2' ) c2
ON (c1.rid = c2.rid AND c1.word = c2.word)
),
max_path ( rid, path ) AS (
SELECT rid,
path
FROM (
SELECT rid,
SYS_CONNECT_BY_PATH(id2, ',') || ',' AS path,
ROW_NUMBER() OVER (PARTITION BY rid ORDER BY LEVEL DESC) AS rn
FROM paired_words
CONNECT BY PRIOR rid = rid
AND PRIOR id1 < id1
AND PRIOR id2 < id2
)
WHERE rn = 1
)
SELECT LISTAGG(word, ' ') WITHIN GROUP (ORDER BY id) AS missing
FROM words w
WHERE NOT EXISTS (
SELECT 1
FROM max_path mp
WHERE w.rid = mp.rid
AND mp.path LIKE '%,' || w.id || ',%'
)
AND w.name = 'COL2'
GROUP BY rid;
Which, for the sample data:
CREATE TABLE table_name ( col1, col2 ) AS
SELECT 'there is book on the table', 'there are flowers on the chair' FROM DUAL UNION ALL
SELECT 'there is book on the table', 'there is a book on the table' FROM DUAL UNION ALL
SELECT 'there is book on the table', 'there is book there is book on the table on the table' FROM DUAL
Outputs:
MISSING
are flowers chair
a
there is book on the table
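Here's one option (which follows what you asked). Read the comments within the code. Note that it compares the two sentences word by word at the same position, so it assumes the shared words sit at the same positions in both columns.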
with test (id, col1, col2) as
  (select 1, 'there is book on the table',
             'there are flowers on the chair'
   from dual
  ),
-- split sentences into words (each in its own line)
sent1 as
  (select id,
          column_value cv,
          regexp_substr(col1, '[^ ]+', 1, column_value) word
   from test cross join
     table(cast(multiset(select level from dual
                         connect by level <= regexp_count(col1, ' ') + 1
                        ) as sys.odcinumberlist))
  ),
sent2 as
  (select id,
          column_value cv,
          regexp_substr(col2, '[^ ]+', 1, column_value) word
   from test cross join
     table(cast(multiset(select level from dual
                         connect by level <= regexp_count(col2, ' ') + 1
                        ) as sys.odcinumberlist))
  )
-- final result
select a.id,
       listagg(b.word, ' ') within group (order by a.cv) result
from sent2 b join sent1 a on a.id = b.id and a.cv = b.cv and a.word <> b.word
group by a.id;
ID RESULT
---------- ------------------------------
1 are flowers chair
I'm trying to build a recursive query and I'm facing a problem.
Please find my dataset below:
WITH table1 ( ID, Code, Label ) as(
SELECT 123, 'C1', 'LABEL_1' from dual UNION ALL
SELECT 1, 'C2', 'LABEL_2' from dual UNION ALL
SELECT 30, 'C3', 'LABEL_3' from dual UNION ALL
SELECT 44, 'C4', 'LABEL_4' from dual UNION ALL
SELECT 5, 'C5', 'LABEL_5' from dual
),
table2 ( ID, id_table1, code_child, label_child ) as (
SELECT 1, 123, 'C1_1','LABEL_1_1' from dual UNION ALL
SELECT 2, 123, 'C1_2','LABEL_1_2' from dual UNION ALL
SELECT 3, 123, 'C1_3','LABEL_1_3' from dual UNION ALL
SELECT 4, 123, 'C1_4','LABEL_1_4' from dual UNION ALL
SELECT 6, 30, 'C3_1','LABEL_3_1' from dual UNION ALL
SELECT 7, 30, 'C3_2','LABEL_3_2' from dual UNION ALL
SELECT 8, 30, 'C3_3','LABEL_3_3' from dual UNION ALL
SELECT 9, 30, 'C3_4','LABEL_3_4' from dual UNION ALL
SELECT 10, 5, 'C5_1','LABEL_5_1' from dual
),
hierarchy as (
Select
a.id, code, label, CODE_CHILD,id_table1
from table1 a
left join table2 b on b.id_table1 = a.ID
)
,recursive (base, id, code, label, CODE_CHILD,id_table1) as (
SELECT
id as base,
id,
code,
label,
CODE_CHILD,
id_table1
FROM hierarchy
UNION ALL
SELECT
previous_level.base,
current_level.id,
current_level.code,
current_level.label,
current_level.CODE_CHILD,
current_level.id_table1
FROM recursive previous_level,
hierarchy current_level
WHERE 1=1
and current_level.id = previous_level.id_table1
)
SELECT * FROM recursive order by base;
And I'm getting this error:
32044. 00000 - "cycle detected while executing recursive WITH query"
*Cause: A recursive WITH clause query produced a cycle and was stopped
in order to avoid an infinite loop.
*Action: Rewrite the recursive WITH query to stop the recursion or use
the CYCLE clause.
Where am I going wrong?
I need to merge these two tables into one.
Here's what I'd like to get as a result:
id | code | label | id_parent
1 | C1 | LABEL_1 |
2 | C2 | LABEL_2 |
3 | C3 | LABEL_3 |
4 | C4 | LABEL_4 |
5 | C5 | LABEL_5 |
6 | C1_1 | LABEL_1_1 | 1
7 | C1_2 | LABEL_1_2 | 1
8 | C1_3 | LABEL_1_3 | 1
9 | C1_4 | LABEL_1_4 | 1
10 | C3_1 | LABEL_3_1 | 3
11 | C3_2 | LABEL_3_2 | 3
12 | C3_3 | LABEL_3_3 | 3
13 | C3_4 | LABEL_3_4 | 3
14 | C5_1 | LABEL_5_1 | 5
Thank you
Not sure why you want a recursive query? It appears that you could just use UNION ALL and join the two tables:
WITH table1 ( ID, Code, Label ) as(
SELECT 1, 'C1', 'LABEL_1' from dual UNION ALL
SELECT 2, 'C2', 'LABEL_2' from dual UNION ALL
SELECT 3, 'C3', 'LABEL_3' from dual UNION ALL
SELECT 4, 'C4', 'LABEL_4' from dual UNION ALL
SELECT 5, 'C5', 'LABEL_5' from dual
),
table2 ( ID, id_table1, code_child, label_child ) as (
SELECT 1, 1, 'C1_1','LABEL_1_1' from dual UNION ALL
SELECT 2, 1, 'C1_2','LABEL_1_2' from dual UNION ALL
SELECT 3, 1, 'C1_3','LABEL_1_3' from dual UNION ALL
SELECT 4, 1, 'C1_4','LABEL_1_4' from dual UNION ALL
SELECT 6, 3, 'C3_1','LABEL_3_1' from dual UNION ALL
SELECT 7, 3, 'C3_2','LABEL_3_2' from dual UNION ALL
SELECT 8, 3, 'C3_3','LABEL_3_3' from dual UNION ALL
SELECT 9, 3, 'C3_4','LABEL_3_4' from dual UNION ALL
SELECT 10, 5, 'C5_1','LABEL_5_1' from dual
)
SELECT ROW_NUMBER() OVER ( ORDER BY table_no, code ) AS id,
code,
label,
id_parent
FROM (
SELECT code,
label,
1 AS table_no,
NULL AS id_parent
FROM table1
UNION ALL
SELECT code_child,
label_child,
2 AS table_no,
id_table1
FROM table2
)
order by table_no, code;
Which outputs:
ID | CODE | LABEL | ID_PARENT
-: | :--- | :-------- | --------:
1 | C1 | LABEL_1 | null
2 | C2 | LABEL_2 | null
3 | C3 | LABEL_3 | null
4 | C4 | LABEL_4 | null
5 | C5 | LABEL_5 | null
6 | C1_1 | LABEL_1_1 | 1
7 | C1_2 | LABEL_1_2 | 1
8 | C1_3 | LABEL_1_3 | 1
9 | C1_4 | LABEL_1_4 | 1
10 | C3_1 | LABEL_3_1 | 3
11 | C3_2 | LABEL_3_2 | 3
12 | C3_3 | LABEL_3_3 | 3
13 | C3_4 | LABEL_3_4 | 3
14 | C5_1 | LABEL_5_1 | 5
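ORA-32044 means that a recursive WITH clause query produced a cycle and was stopped in order to avoid an infinite loop.
This error usually comes from rows that form a circular relationship, which would make the recursion loop forever.
For example: P is the parent of C, and C is in turn the parent of P.
In the query from the question, the loop comes from the join rather than from corrupt data: hierarchy is built with b.id_table1 = a.ID, so every row that has a child carries an id_table1 equal to its own id, and the recursive condition current_level.id = previous_level.id_table1 matches each such row to itself.
You can get the desired output simply by combining the two tables with UNION ALL.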
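The error message also mentions the CYCLE clause. As a rough illustration of what that clause does, here is a minimal sketch built on the P and C example above; the links and walk names, their columns, and the is_cycle marker are all hypothetical and not part of the question's schema:

```sql
-- Hypothetical data: P is the parent of C, and C is the parent of P.
WITH links ( parent_id, child_id ) AS (
  SELECT 'P', 'C' FROM DUAL UNION ALL
  SELECT 'C', 'P' FROM DUAL
),
walk ( root_id, child_id, depth ) AS (
  -- Anchor member: start from P.
  SELECT parent_id, child_id, 1
  FROM   links
  WHERE  parent_id = 'P'
  UNION ALL
  -- Recursive member: follow each child to its own children.
  SELECT w.root_id, l.child_id, w.depth + 1
  FROM   walk w
         INNER JOIN links l
         ON ( l.parent_id = w.child_id )
)
-- Flag a row when an ancestor row already has the same child_id,
-- and stop descending from that row.
CYCLE child_id SET is_cycle TO 'Y' DEFAULT 'N'
SELECT root_id, child_id, depth, is_cycle
FROM   walk;
```

With the clause in place, the row that repeats an ancestor's child_id is expected to be returned with is_cycle = 'Y' and not expanded any further, which makes the looping rows visible instead of aborting the query. It does not change the fact that plain UNION ALL is the simpler fit for the desired output.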
My column value looks something like the following (just an example I created):
{BASICINFOxxxFyyy100x} {CONTACTxxx12345yyy20202x}
It can contain 0 or more blocks of data. I have created the query below to split the blocks:
with x as
(select
'{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' a from dual)
select REGEXP_SUBSTR(a,'({.*?x})',1,rownum,null,1)
from x
connect by rownum <= REGEXP_COUNT(a,'x}')
However I would like to further split the output into 3 columns like below:
ColumnA | ColumnB | ColumnC
------------------------------
BASICINFO | F |100
CONTACT | 12345 |20202
The delimiters are always standard. I failed to create a pretty query which gives me the desired output.
Thanks in advance.
Oracle 11g R2 Schema Setup:
CREATE TABLE your_table ( str ) AS
SELECT '{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' from dual
/
Query 1:
select REGEXP_SUBSTR(
t.str,
'\{([^}]*?)xxx([^}]*?)yyy([^}]*?)x\}',
1,
l.COLUMN_VALUE,
NULL,
1
) AS col1,
REGEXP_SUBSTR(
str,
'\{([^}]*?)xxx([^}]*?)yyy([^}]*?)x\}',
1,
l.COLUMN_VALUE,
NULL,
2
) AS col2,
REGEXP_SUBSTR(
str,
'\{([^}]*?)xxx([^}]*?)yyy([^}]*?)x\}',
1,
l.COLUMN_VALUE,
NULL,
3
) AS col3
FROM your_table t
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT( t.str,'\{([^}]*?)xxx([^}]*?)yyy([^}]*?)x\}')
) AS SYS.ODCINUMBERLIST
)
) l
Results:
| COL1 | COL2 | COL3 |
|-----------|-------|-------|
| BASICINFO | F | 100 |
| CONTACT | 12345 | 20202 |
Note:
Your query:
select REGEXP_SUBSTR(a,'({.*?x})',1,rownum,null,1)
from x
connect by rownum <= REGEXP_COUNT(a,'x}')
Will not work when you have multiple rows of input. In the CONNECT BY clause, the hierarchical query has nothing to stop it connecting Row1-Level2 to either Row1-Level1 or Row2-Level1, so it connects to both, and as the hierarchy gets deeper it creates exponentially more duplicate copies of the output rows. There are hacks you can use to stop this, but if you are going to use hierarchical queries it is much more efficient to put the row generator into a correlated sub-query and CROSS JOIN it back to the original table (because it is correlated, it won't join to the wrong rows).
Better yet would be to fix your data structure so you are not storing multiple values in delimited strings.
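For reference, one commonly seen form of the hacks mentioned above ties each level to its own source row. This is only a sketch: it assumes the table has a unique id column, which the original your_table does not have, and the second input row is invented purely to show the multi-row case:

```sql
-- Hypothetical variant of your_table with a unique id column added.
WITH your_table ( id, str ) AS (
  SELECT 1, '{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' FROM DUAL UNION ALL
  SELECT 2, '{ADDRESSxxxAyyy7x}' FROM DUAL  -- made-up second row
)
SELECT id,
       LEVEL AS block_no,
       REGEXP_SUBSTR( str, '\{[^}]*\}', 1, LEVEL ) AS block
FROM   your_table
CONNECT BY LEVEL <= REGEXP_COUNT( str, '\{[^}]*\}' )
       -- Only connect a row to itself, never to another input row.
       AND PRIOR id = id
       -- A non-deterministic function call keeps Oracle from treating
       -- the self-connection as a loop.
       AND PRIOR SYS_GUID() IS NOT NULL;
```

Each input row should then produce one output row per block, without the cross-row duplicates described in the note. The correlated sub-query in Query 1 achieves the same isolation without needing a unique key.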
with x as
  (select '{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' a from dual
  ),
y as (
  select REGEXP_SUBSTR(a,'({.*?x})',1,rownum,null,1) c1
  from x
  connect by rownum <= REGEXP_COUNT(a,'x}')
)
select
  substr(c1,2,instr(c1,'xxx')-2) z1,
  substr(c1,instr(c1,'xxx')+3,instr(c1,'yyy')-instr(c1,'xxx')-3) z2,
  rtrim(substr(c1,instr(c1,'yyy')+3),'x}') z3
from y;
Z1 Z2 Z3
--------------- --------------- ---------------
BASICINFO F 100
CONTACT 12345 20202
Here is another solution, which picks up from where you left off. Your query already splits the row into 2 rows; the query below splits each of them into 3 columns:
WITH x
AS (SELECT '{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' a
FROM DUAL),
-- Your query result here
tbl
AS ( SELECT REGEXP_SUBSTR (a,
'({.*?x})',
1,
ROWNUM,
NULL,
1)
Col
FROM x
CONNECT BY ROWNUM <= REGEXP_COUNT (a, 'x}'))
--- Actual Query
SELECT col,
REGEXP_SUBSTR (col,
'(.*?{)([^x]+)',
1,
1,
'',
2)
AS COL1,
REGEXP_SUBSTR (REGEXP_SUBSTR (col,
'(.*?)([^x]+)',
1,
2,
'',
2),
'[^y]+',
1,
1)
AS COL2,
REGEXP_SUBSTR (REGEXP_SUBSTR (col,
'[^y]+x',
1,
2),
'[^x]+',
1,
1)
AS COL3
FROM tbl;
Output:
SQL> /
COL COL1 COL2 COL3
------------------------------------------------ ------------------------------------------------ ------------------------------------------------ ------------------------------------------------
{BASICINFOxxxFyyy100x} BASICINFO F 100
{CONTACTxxx12345yyy20202x} CONTACT 12345 20202
I have a table test2. It contains:
ID
1
4
5
10
Now I have found the missing numbers in this sequence with this query:
SELECT min_ID - 1 + level mn FROM
( SELECT MIN(ID) min_ID , MAX(ID) max_ID FROM test2 )
CONNECT BY level <= max_ID - min_ID + 1 minus SELECT ID FROM test2
output is:
MN
---
2
3
6
7
8
9
Now I want to combine these 2 columns. I am unable to do this; please help me.
I want output like:
1 2
4 3
7 5
10 6
8
9
Oracle Setup:
CREATE TABLE test2 (id) AS
SELECT 1 FROM DUAL UNION ALL
SELECT 4 FROM DUAL UNION ALL
SELECT 5 FROM DUAL UNION ALL
SELECT 10 FROM DUAL;
Query:
WITH bounds ( mn, mx ) AS (
SELECT MIN( id ), MAX( id ) FROM test2
),
missing (id, rn) AS (
SELECT id, ROWNUM
FROM (
SELECT mn + LEVEL AS id
FROM bounds
CONNECT BY LEVEL < MX - MN
MINUS
SELECT id
FROM test2
)
),
existing ( id, rn ) AS (
SELECT id, ROWNUM
FROM test2
)
SELECT e.id, m.id
FROM existing e
FULL OUTER JOIN
missing m
ON ( e.rn = m.rn );
Output
ID ID
---------- ----------
1 2
4 3
5 6
10 7
9
8
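One caveat about the query above: ROWNUM is assigned in whatever order the rows happen to arrive, and the final SELECT has no ORDER BY, so neither the pairing nor the row order is guaranteed (the 9 appearing before the 8 in the output reflects that). A minimal sketch of a deterministic variant, assuming the same test2 table, could number both sets with ROW_NUMBER() instead; the aliases existing_id and missing_id are illustrative names only:

```sql
WITH bounds ( mn, mx ) AS (
  SELECT MIN( id ), MAX( id ) FROM test2
),
missing ( id, rn ) AS (
  -- Number the missing values in ascending order.
  SELECT id, ROW_NUMBER() OVER ( ORDER BY id )
  FROM (
    SELECT mn + LEVEL AS id
    FROM   bounds
    CONNECT BY LEVEL < mx - mn
    MINUS
    SELECT id
    FROM   test2
  )
),
existing ( id, rn ) AS (
  -- Number the existing values in ascending order.
  SELECT id, ROW_NUMBER() OVER ( ORDER BY id )
  FROM   test2
)
SELECT e.id AS existing_id,
       m.id AS missing_id
FROM   existing e
       FULL OUTER JOIN missing m
       ON ( e.rn = m.rn )
-- Whichever side has a row supplies the sort position.
ORDER BY COALESCE( e.rn, m.rn );
```

The idea is that the n-th smallest existing value is always paired with the n-th smallest missing value, and the rows come back in that order.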