How to Shorten a Regular Expression - Oracle

I have the regular expression below. When I use it in Oracle SQL I get an ORA-12723 error. How can I rewrite it in the shortest form?
WITH test_data ( str ) AS (
SELECT 'This is extension 1234, here is mobile phone: 090-1234-5678 maybe 8+24-98765432. Then +1-(234)-090-345 also 86 21-4566-4556' AS str FROM DUAL
)
SELECT TRIM(
TRAILING ',' FROM
REGEXP_REPLACE(
str,
'.*?(\+?\d{1,11}[-,\+]\d{1,11}[-,\+]\d{1,11}[-,\+]\d{1,11}[-,\+]\d{1,11}[-,\+]\d{3,11}|\+?\d{1,11}[-,\+]\d{1,11}[-,\+]\d{1,11}[-,\+]\d{1,11}[-,\+]\d{3,11}|\+?\d{1,11}[-,\+]\d{1,11}[-,\+]\d{1,11}[-,\+]\d{3,11}|\+?\d{1,11}[-,\+]\d{1,11}[-,\+]\d{3,11}|\+?\d{1,11}[-,\+]\d{3,11}|\d{3,11}|$)',
'\1,'
)
) AS replaced_str
FROM test_data
The result I want is:
1234,090-1234-5678,8+24-98765432,+1-(234)-090-345,86 21-4566-4556

Consider this approach. This uses CONNECT BY to traverse the string and parse it into elements that are separated by a space or the end of the line. Then for each element, remove non-digit characters ('\D'). Lastly use LISTAGG() to put the elements back into one comma delimited string.
WITH test_data(str) AS (
SELECT 'Txa233g141b Ta233141 Ta233142 Ta233147zz Ta233xx148zz' AS str FROM DUAL
)
SELECT listagg(regexp_replace(regexp_substr(str, '(.*?)( |$)', 1, level, null, 1), '\D'), ',')
within group (order by str) replaced_str
FROM test_data
connect by level <= regexp_count(str, ' ') + 1;
REPLACED_STR
--------------------------------------------------------------------------------
233141,233141,233142,233147,233148
1 row selected.
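As for shortening the original pattern itself: its five long alternatives differ only in how many times the group [-,\+]\d{1,11} repeats, so they can be folded into one interval quantifier. The following is an untested sketch that reuses the question's own query. It should accept the same strings as the original alternation, but whether it avoids ORA-12723 on a given Oracle version is an assumption.

```sql
-- Sketch only: same query as in the question, with the repeated
-- "[-,\+]\d{1,11}" segments collapsed into "{0,4}".
WITH test_data ( str ) AS (
  SELECT 'This is extension 1234, here is mobile phone: 090-1234-5678 maybe 8+24-98765432. Then +1-(234)-090-345 also 86 21-4566-4556' AS str FROM DUAL
)
SELECT TRIM(
         TRAILING ',' FROM
         REGEXP_REPLACE(
           str,
           -- optional "+", first digit run, 0 to 4 middle "separator + digits"
           -- segments, then a final separator followed by 3 to 11 digits
           '.*?(\+?\d{1,11}([-,\+]\d{1,11}){0,4}[-,\+]\d{3,11}|\d{3,11}|$)',
           '\1,'
         )
       ) AS replaced_str
FROM test_data;
```

The outer parentheses are still group 1, so the '\1,' replacement is unchanged; the new inner group becomes group 2 and is not referenced.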

Related

In Oracle SQL, how to retrieve the first, second words, From a group of words using regular expressions

Using a regular expression, I would like to retrieve the second word, and the first letter of the second word, from a group of words. I was able to retrieve the first word and the first letter of the first word; I would like to do the same for the second one.
Select regexp_substr('Red blue Green', '[A-z]*') "First word", regexp_substr ( 'Red blue Green', '[A-Z]' ) "the first letter of first Word" FROM DUAL;
Create table table_x(
str VARCHAR2(500)
);
INSERT into table_x (str) VALUES ('red blue green');
INSERT into table_x (str) VALUES ('one two three four');
Here is a generic solution with substitution variables (&word_num and &char_num).
WITH got_wordn AS
(
SELECT REGEXP_SUBSTR ( str
, '\S+'
, 1
, &word_num
) AS wordn
FROM table_x
)
SELECT wordn
, SUBSTR (wordn, &char_num, 1) AS charx
FROM got_wordn
;
In this version I hardcoded the values you were looking for: the 1st character of the second word.
WITH got_wordn AS
(
SELECT REGEXP_SUBSTR ( str
, '\S+'
, 1
, 2
) AS wordn
FROM table_x
)
SELECT wordn
, SUBSTR (wordn, 1, 1) AS charx
FROM got_wordn
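The same idea can be written against the literal from the question, without a table. This is a minimal sketch; the column aliases are only illustrative.

```sql
-- 4th argument of REGEXP_SUBSTR = which occurrence of the pattern to return.
-- '\S+' matches a run of non-whitespace characters, i.e. one word.
SELECT REGEXP_SUBSTR('Red blue Green', '\S+', 1, 2)               AS second_word,   -- blue
       SUBSTR(REGEXP_SUBSTR('Red blue Green', '\S+', 1, 2), 1, 1) AS first_letter   -- b
FROM DUAL;
```

Changing the occurrence argument from 2 to 3 would return the third word, and so on.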

How to get character or string after nth occurrence of pipeline '|' symbol in ORACLE using REGULAR_EXPRESSION?

What is the regular expression query to get character or string after nth occurrence of pipeline | symbol in ORACLE? For example I have two strings as follows,
Jack|Sparrow|17-09-16|DY7009|Address at some where|details
|Jack|Sparrow|17-09-16||Address at some where|details
I want 'DY7009', which comes after the 3rd pipe symbol counting from the 1st position. What regular expression would do this? In the second string, the 1st position holds a | symbol; there I want the 4th string, and if it has no value the query should return NULL or a blank.
select regexp_substr('Jack|Sparrow|17-09-16|DY7009|Address at some where|details'
,' ?? --REX Exp-- ?? ') as col
from dual;
Result - DY7009
select regexp_substr('|Jack|Sparrow|17-09-16||Address at some where|details'
,' ?? --REX Exp-- ?? ') as col
from dual;
Result - '' or (i.e. NULL)
So what should the regexp be? Please help. Thank you in advance.
Update:
Thank you all, I appreciate your answers. I think I didn't ask the question clearly. I just want a regular expression that returns the string after the nth occurrence of the pipe symbol. I don't want to replace anything, so regexp_substr alone should do the job.
Suppose the string is 'Jack|Sparrow|SQY778|17JULY17||00J1'.
The string after the 2nd pipe symbol is SQY778, and the string after the 3rd is 17JULY17. After the 4th pipe symbol the result should be blank or NULL, because there is nothing between the 4th and 5th pipes. To get the string after the 5th symbol I should only have to change one digit in the regular expression (to 5) and get 00J1.
Here ya go. Replace the 4th argument to regexp_substr() with the number of the field you want.
with tbl(str) as (
select 'Jack|Sparrow|17-09-16|DY7009|Address at some where|details ' from dual
)
select regexp_substr(str, '(.*?)(\||$)', 1, 4, NULL, 1) field_4
from tbl;
FIELD_4
--------
DY7009
SQL>
To list all the fields:
with tbl(str) as (
select 'Jack|Sparrow|17-09-16|DY7009|Address at some where|details ' from dual
)
select regexp_substr(str, '(.*?)(\||$)', 1, level, NULL, 1) split
from tbl
connect by level <= regexp_count(str, '\|')+1;
SPLIT
-------------------------
Jack
Sparrow
17-09-16
DY7009
Address at some where
details
6 rows selected.
SQL>
So if you want select fields you could use:
with tbl(str) as (
select 'Jack|Sparrow|17-09-16|DY7009|Address at some where|details ' from dual
)
select
regexp_substr(str, '(.*?)(\||$)', 1, 1, NULL, 1) first,
regexp_substr(str, '(.*?)(\||$)', 1, 2, NULL, 1) second,
regexp_substr(str, '(.*?)(\||$)', 1, 3, NULL, 1) third,
regexp_substr(str, '(.*?)(\||$)', 1, 4, NULL, 1) fourth
from tbl;
Note this regex handles NULL elements and will still return the correct value. Some of the other answers use the form '[^|]+' for parsing the string but this fails when there is a NULL element and should be avoided. See here for proof: https://stackoverflow.com/a/31464699/2543416
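To make the difference concrete, here is a small side-by-side sketch on a string whose 4th field is empty. The column aliases are made up for illustration.

```sql
WITH tbl(str) AS (
  SELECT 'Jack|Sparrow|17-09-16||Address at some where|details' FROM dual
)
SELECT
  -- '[^|]+' needs at least one character, so the empty field is skipped
  -- and the 4th match is really the 5th field.
  REGEXP_SUBSTR(str, '[^|]+', 1, 4)                AS with_negated_class,  -- Address at some where
  -- '(.*?)(\||$)' also matches an empty field; group 1 is then NULL.
  REGEXP_SUBSTR(str, '(.*?)(\||$)', 1, 4, NULL, 1) AS with_lazy_group      -- NULL
FROM tbl;
```

The sixth argument (1) asks for the first capture group only, so the trailing delimiter is not part of the result.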
Don't have enough reputation to comment on Chris Johnson's answer so adding my own. Chris has the correct approach of using back-references but forgot to escape the Pipe character.
The regex will look like this.
WITH dat
AS (SELECT 'Jack|Sparrow|17-09-16|DY7009|Address at some where|details' AS str,
3 AS pos
FROM DUAL
UNION
SELECT ' |Jack|Sparrow|17-09-16||Address at some where|details' AS str,
4 AS pos
FROM DUAL)
SELECT str,
pos,
REGEXP_REPLACE (str, '^([^\|]*\|){' || pos || '}([^\|]*)\|.*$', '\2')
AS regex_result
FROM dat;
I build the regex dynamically by concatenating the position of the pipe character into the pattern.
The result looks like this.
|Jack|Sparrow|17-09-16||Address at some where|details (4):
Jack|Sparrow|17-09-16|DY7009|Address at some where|details (3): DY7009
You can use regexp_replace to return a matching group. In your example, the field after the third pipe is captured by the second group, \2, and can be retrieved like this:
select regexp_replace(
'Jack|Sparrow|17-09-16|DY7009|Address at some where|details',
'^([^\|]*\|){3}([^\|]*)\|.*$',
'\2'
) as col
from dual;
Edit: Thanks Arijit Kanrar for pointing out the missing escape characters.
To OP: regexp_replace doesn't replace anything in the database, only in the returned string.
You can use this query to get the value at a specific position (the nth occurrence):
SELECT nth_string
FROM
(SELECT TRIM (REGEXP_SUBSTR (long_string, '[^|]+', 1, ROWNUM) ) nth_string ,
level AS lvl
FROM
(SELECT REPLACE('Jack|Sparrow|17-09-16|DY7009|Address at some where|details','||','| |') long_string
FROM DUAL
)
CONNECT BY LEVEL <= REGEXP_COUNT ( long_string, '[^|]+')
)
WHERE lvl = 4;
Note that I am using the standard Oracle query for splitting a delimited string into records. To handle an empty value between delimiters, as in your second case, I replace '||' with '| |'. The space is converted to NULL by the TRIM() function.
You can get any nth record by changing the number in lvl = at the end of the query.
Let me know your feedback. Thanks.
EDIT:
It does not seem to work with regexp_substr() alone, as I found no way to convert the blank between '||' to an Oracle NULL. So the intermediate TRIM() was required, and I added a REPLACE to make it easier. There may be patterns that match this scenario directly, but I could not find them.
Here are all the scenarios for the 4th occurrence.
WITH t
AS (SELECT '|Jack|Sparrow|SQY778|17JULY17||00J1' long_string
FROM dual
UNION ALL
SELECT 'Jack|Sparrow|SQY778|17JULY17||00J1' long_string
FROM dual
UNION ALL
SELECT '||Jack|Sparrow|SQY778|17JULY17|00J1' long_string
FROM dual)
SELECT long_string,
Trim (Regexp_substr (mod_string, '\|([^|]+)', 1, 4, NULL, 1)) nth_string
FROM (SELECT long_string,
Replace(long_string, '||', '| |') mod_string
FROM t) ;
LONG_STRING                          NTH_STRING
-----------------------------------  ----------
|Jack|Sparrow|SQY778|17JULY17||00J1  17JULY17
Jack|Sparrow|SQY778|17JULY17||00J1   NULL
||Jack|Sparrow|SQY778|17JULY17|00J1  SQY778
EDIT2: Finally, a pattern that gives the solution. Thanks to Gary_W.
To get the nth occurrence from the string, use:
WITH t
AS (SELECT '|Jack|Sparrow|SQY778|17JULY17||00J1' long_string
FROM dual
UNION ALL
SELECT 'Jack|Sparrow|SQY778|17JULY17||00J1' long_string
FROM dual
UNION ALL
SELECT '||Jack|Sparrow|SQY778|17JULY17|00J1' long_string
FROM dual)
SELECT long_string,
Trim (regexp_substr (long_string, '(.*?)(\||$)', 1, :n + 1, NULL, 1)) nth_string
FROM t;

Oracle regexp_substr to get first portion of a string

I am stuck here. I am using Oracle and I want to get the first part of a string, before the first appearance of '|'. This is my query, but it returns the last part, i.e. 25.0. I want it to return the first part, i.e. 53. How do I achieve that?
select regexp_substr('53|100382951130|25.0', '[^|]+$', 1,1) as part1 from dual
Assuming you always have at least one occurrence of '|', you can use the following, with no regexp:
with test(string) as ( select '53|100382951130|25.0' from dual)
select substr(string, 1, instr(string, '|')-1)
from test
You could even use regexp to achieve the same thing, or even handle the case in which you have no '|':
with test(string) as (
select '53|100382951130|25.0' from dual union all
select '1234567' from dual)
select string,
substr(string, 1, instr(string, '|')-1),
regexp_substr(string, '[^|]*')
from test
You can even handle the case with no occurrence of '|' without regexp:
with test(string) as (
select '53|100382951130|25.0' from dual union all
select '1234567' from dual)
select string,
substr(string, 1, instr(string, '|')-1),
regexp_substr(string, '[^|]*'),
substr(string, 1,
case
when instr(string, '|') = 0
then length(string)
else
instr(string, '|')-1
end
)
from test
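For reference, the original query returned the last part because of the trailing $, which anchors the match to the end of the string. A short sketch of the difference, with illustrative aliases:

```sql
SELECT
  -- '$' forces the run of non-pipe characters to end at the end of the string
  REGEXP_SUBSTR('53|100382951130|25.0', '[^|]+$', 1, 1) AS last_part,    -- 25.0
  -- without the anchor, the 1st occurrence is the first part
  REGEXP_SUBSTR('53|100382951130|25.0', '[^|]+',  1, 1) AS first_part,   -- 53
  -- and the occurrence argument selects later parts
  REGEXP_SUBSTR('53|100382951130|25.0', '[^|]+',  1, 2) AS second_part   -- 100382951130
FROM dual;
```

Note that '[^|]+' skips empty parts, so it only suits data where no part between pipes is empty.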

stored procedure parameter of CSV input to return all records

I have the following Oracle stored procedure that takes a CSV string of user IDs and returns the matching users through the output cursor. It works fine:
create or replace PROCEDURE GET_USERS_BY_IDS
(
v_cur OUT sys_refcursor
,v_userIdsCsv IN varchar2 DEFAULT ''
) AS
BEGIN
open v_cur for
with userIds
as
(
select
trim( substr (txt,
instr (txt, ',', 1, level ) + 1,
instr (txt, ',', 1, level+1) - instr (txt, ',', 1, level) -1 ) )
as token
from (select ','||v_userIdsCsv||',' txt
from dual)
connect by level <=
length(v_userIdsCsv)-length(replace(v_userIdsCsv,',',''))+1
)
select
id
,lastname
,firstname
from
users
where
id in (select * from userIds);
END GET_USERS_BY_IDS;
So by running exec GET_USERS_BY_IDS(:cur1, '123,456') I get the users with IDs 123 and 456. However, I would like to return ALL users if I pass in an empty string, i.e. exec GET_USERS_BY_IDS(:cur1, '') would return all users. What do I have to change in the procedure code to accomplish that? Thanks.
Consider this solution using REGEXP functions which I feel simplifies things. I also incorporated the test from my comment as well. Note the REGEXP handles a NULL list element too:
create or replace PROCEDURE GET_USERS_BY_IDS
(
v_cur OUT sys_refcursor
,v_userIdsCsv IN varchar2 DEFAULT '1'
) AS
BEGIN
open v_cur for
with userIds
as
(
select trim( regexp_substr(v_userIdsCsv, '(.*?)(,|$)', 1, level, NULL, 1) ) as token
from dual
connect by level <= regexp_count(v_userIdsCsv, ',') + 1
)
select
id
,lastname
,firstname
from
users
where v_userIdsCsv = '1' -- Empty list returns all users
OR id in (select * from userIds);
END GET_USERS_BY_IDS;
It's untested, so let us know what happens if you test it.
Do you mean something as simple as
BEGIN
-- Oracle treats '' as NULL, so test with IS NULL rather than = ''
if v_userIdsCsv is null then
open v_cur for select id, lastname, firstname from users;
else (rest of your code)
end if;
?
OK, with the confirmation in comments...
It seems you should be able to change the WHERE condition at the very end. Oracle treats '' as NULL, so the test has to be IS NULL rather than = '':
where
v_userIdsCsv is null or id in (select * from userIds);
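Any comparison against an empty string deserves care here, because Oracle stores a zero-length VARCHAR2 as NULL and a comparison with NULL is never true. A quick check, with illustrative aliases:

```sql
SELECT
  CASE WHEN '' IS NULL THEN 'null'  ELSE 'not null'  END AS empty_is_null,       -- null
  -- '' = '' evaluates to UNKNOWN, so the ELSE branch is taken
  CASE WHEN '' = ''    THEN 'equal' ELSE 'not equal' END AS empty_equals_empty   -- not equal
FROM dual;
```

So a parameter passed as '' arrives as NULL, and a condition of the form param IS NULL is what detects it.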
Use an outer join between the users and userIds, with a WHERE condition that filters only when the CSV is not empty.
Has it helped?
with csv as (select '321,333' aa from dual)
,userIds
as
(
select
trim( substr (txt,
instr (txt, ',', 1, level ) + 1,
instr (txt, ',', 1, level+1) - instr (txt, ',', 1, level) -1 ) )
as token
from (select ','||(select aa from csv )||',' txt
from dual)
connect by level <=
length((select aa from csv ))-length(replace((select aa from csv),',',''))+1
)
select
user_id
,username
from
all_users a
left join userIds b on a.user_id = b.token
where nvl2((select aa from csv),b.token,a.user_id) = a.user_id
I think I found a more efficient way to do this now. In the WHERE clause, I can just short-circuit it if the input parameter is blank (tested with IS NULL, since Oracle treats '' as NULL):
where
v_userIdsCsv is null or
id in (select * from userIds);

Converting a string (delimited by consecutive delimiters) to rows using Oracle regular expression

I have a generic string delimited by a consecutive-tilde delimiter (~~) in Oracle. For example, the string is 'apple~orange~~mango~~grapes'. It needs to be converted into rows, but note that the separator is a double tilde, not a single tilde. The output should look like this:
apple~orange
mango
grapes
A workaround already exists using the instr and substr Oracle functions, but I need a cleaner solution using Oracle regular expressions. I have tried the query below, but it does not give the correct result:
WITH str AS (SELECT 'apple~orange~~mango~~grapes' str FROM dual),
cnt AS (SELECT LEVEL sno FROM dual CONNECT BY LEVEL < 5)
SELECT regexp_substr (str, '[^~]+', 1, sno) FROM str CROSS JOIN cnt;
Try this (if any of your strings may contain , as an actual value, use a different replacement character such as *, ^ or #):
WITH STR AS (SELECT REPLACE('apple~orange~~mango~~grapes','~~',',') STR FROM DUAL),
CNT AS (SELECT LEVEL SNO FROM DUAL CONNECT BY LEVEL < 4)
SELECT REGEXP_SUBSTR (STR, '[^,]+', 1, SNO) FROM STR CROSS JOIN CNT;
You can also use XMLTABLE in Oracle:
WITH CTE AS (SELECT '"'
|| REPLACE('apple~orange~~mango~~grapes','~~','","')
|| '"' STR FROM DUAL)
select column_value str from cte, xmltable(str);
Try this,
select
t.str
, regexp_substr (t.str, '[^~(?=~)]+', 1, rn) spl
from YOURTABLE t
cross
join (select rownum rn
from (select max (length (regexp_replace (t.str, '[^~~]+'))) + 1 mx
from YOURTABLE t
)
connect by level <= mx
)
where regexp_substr (t.str,'[^~(?=~)]+' , 1, rn) is not null
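A regex-only alternative is to treat the two-character delimiter as a literal alternative in the pattern instead of replacing it first. This is a minimal sketch on the question's single sample string; the alias part is illustrative.

```sql
WITH str AS (SELECT 'apple~orange~~mango~~grapes' str FROM dual)
SELECT
  -- lazily capture everything up to the next '~~' or the end of the string;
  -- a single '~' is not a delimiter, so 'apple~orange' stays together
  REGEXP_SUBSTR(str, '(.*?)(~~|$)', 1, LEVEL, NULL, 1) AS part
FROM str
-- number of elements = number of '~~' delimiters + 1
CONNECT BY LEVEL <= REGEXP_COUNT(str, '~~') + 1;
-- returns three rows: apple~orange, mango, grapes
```

With more than one source row, the CONNECT BY clause would need extra conditions to keep each row's levels separate.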

Resources