How to chunk a string in pl sql using regexp? - oracle

I have the following string: ABCAPP9 Xore-Done-1. I want to split it into its 4 elements and retrieve each one separately in PL/SQL. Please give me the 4 queries that return the following 4 results, one per query. Thanks.
ABCAPP9
Xore
Done
1

REGEXP_SUBSTR ('ABCAPP9 Xore-Done-1', '[^[:space:]-]+', 1, n)
will give you the n-th part. Replace n with the position you want. To return all the parts at once:
SELECT REGEXP_SUBSTR ('ABCAPP9 Xore-Done-1', '[^[:space:]-]+', 1, LEVEL)
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('ABCAPP9 Xore-Done-1', '[^[:space:]-]+', 1, LEVEL) IS NOT NULL
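Since the question asks for each element separately, the occurrence argument can also be fixed per call. The following is a minimal sketch using the same pattern as above; the inline view and the aliases str and part1 to part4 are illustrative names, not part of the question. It assumes no element of the string is empty.

```sql
-- One column per element. To get a single element on its own,
-- run just one of these REGEXP_SUBSTR calls in a SELECT ... FROM dual.
SELECT REGEXP_SUBSTR(str, '[^[:space:]-]+', 1, 1) AS part1,  -- ABCAPP9
       REGEXP_SUBSTR(str, '[^[:space:]-]+', 1, 2) AS part2,  -- Xore
       REGEXP_SUBSTR(str, '[^[:space:]-]+', 1, 3) AS part3,  -- Done
       REGEXP_SUBSTR(str, '[^[:space:]-]+', 1, 4) AS part4   -- 1
  FROM (SELECT 'ABCAPP9 Xore-Done-1' AS str FROM dual);
```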

This should really be a comment to @Mottor, but since comments allow no formatting I have to post it here.
A word of warning. As long as all elements of your string will be present and the delimiters can NEVER be next to each other you will be ok. However, the regex format of '[^<delimiter>]+' commonly used for parsing strings will not return the correct value if there is a NULL element in the list! See this post for proof: https://stackoverflow.com/a/31464699/2543416. To test in your example, remove the substring "Xore", leaving the space and hyphen next to each other:
SQL> SELECT REGEXP_SUBSTR ('ABCAPP9 -Done-1', '[^[:space:]-]+', 1, LEVEL)
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('ABCAPP9 -Done-1', '[^[:space:]-]+', 1, LEVEL) IS NOT NULL;
REGEXP_SUBSTR('
---------------
ABCAPP9
Done
1
The 2nd element should be NULL, but "Done" is returned instead! Not good if the position is important.
Use this format instead to handle NULLs and return the correct string element in the correct position (shown here with "Xore" removed and thus a NULL returned in that position to prove it handles the NULL):
SQL> with tbl(str) as (
select 'ABCAPP9 -Done-1' from dual
)
select regexp_substr(str, '(.*?)( |-|$)', 1, level, NULL, 1)
from tbl
connect by regexp_substr(str, '(.*?)( |-|$)', 1, level) is not null;
REGEXP_SUBSTR(S
---------------
ABCAPP9

Done
1
SQL>
I shudder to think of all the bad data being returned out there.
So user2153047, if you are still with me: to get the 3rd element (with NULL elements handled correctly), you would use:
SQL> select regexp_substr('ABCAPP9 -Done-1', '(.*?)( |-|$)', 1, 3, NULL, 1) "3rd"
from dual;
3rd
----
Done

Related

Oracle: is it possible to trim a string and count the number of occurrences, and insert to a new table?

My source table looks like this:
id|value|count
Value is a String of values separated by semicolons(;). For example it may look like this
A;B;C;D;
Some may not have values at a certain position, like this
A;;;D;
First, I've selectively moved records to a new table(targettable) based on positions with values using regexp. I achieved this by using [^;]+; for having some value between the semicolons, and [^;]*; for those positions I don't care about. For example, if I wanted the 1st and 4th place to have values, I could incorporate regexp with insert into like this
insert into
targettable tt (id, value, count)
SELECT some_seq.nextval,value, count
FROM source table
WHERE
regexp_like(value, '^[^;]+;[^;]*;[^;]*;[^;]+;')
so now my new table has a list of records that have values at the 1st and 4th position. It may look like this
1|A;B;C;D;|2
2|B;;;E;|1
3|A;D;;D|3
Next there are 2 things I want to do. 1. get rid of values other than 1st and 4th. 2.combine identical values and add up their count. For example, record 1 and 3 are the same, so I want to trim so they become A;D;, and then add their count, so 2+3=5. Now my new table looks like this
1|A;D;|5
2|B;E;|1
As long as I can somehow get to the final table from the source table, I don't care about the steps. The intermediate table is not required, but it may help me achieve the final result. I'm not sure if I can go any further with Oracle though. If not, I'll have to move and process the records with Java. Bear in mind I have millions of records, so I would prefer the Oracle method if it is possible.
You should be able to skip the intermediate table; just extract the 1st and 4th elements, using the regexp_substr() function, while checking that those are not null:
select regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) -- first position
|| ';' || regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) -- fourth position
|| ';' as value, -- if you want trailing semicolon
count
from source
where regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) is not null
and regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) is not null;
VALUE                   COUNT
------------------ ----------
A;D;                        2
B;E;                        1
A;D;                        3
and then aggregate those results:
select value, sum(count) as count
from (
select regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) -- first position
|| ';' || regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) -- fourth position
|| ';' as value, -- if you want trailing semicolon
count
from source
where regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) is not null
and regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) is not null
)
group by value;
VALUE                   COUNT
------------------ ----------
A;D;                        5
B;E;                        1
Then for your insert you can use that query, either with an auto-increment ID (12c+), or setting an ID from a sequence via a trigger, or possibly wrapped in another level of subquery to get the value explicitly:
insert into target (id, value, count)
select some_seq.nextval, value, count
from (
select value, sum(count) as count
from (
select regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) -- first position
|| ';' || regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) -- fourth position
|| ';' as value, -- if you want trailing semicolon
count
from source
where regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) is not null
and regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) is not null
)
group by value
);
If you're creating a new sequence to do that, so they start from 1, you can use rownum or row_number() instead.
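As a rough sketch of the row_number() variant mentioned above: the insert below numbers the aggregated rows from 1 without any sequence. The table and column names are the ones used in this answer; ordering the IDs by value is an assumption made only to have a deterministic numbering.

```sql
-- Sketch: generate IDs 1..n with row_number() instead of some_seq.nextval.
-- The ORDER BY inside OVER() is an arbitrary choice, not a requirement.
insert into target (id, value, count)
select row_number() over (order by value), value, count
from (
  select value, sum(count) as count
  from (
    select regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1)            -- first position
      || ';' || regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1)       -- fourth position
      || ';' as value,
      count
    from source
    where regexp_substr(value, '(.*?)(;|$)', 1, 1, null, 1) is not null
    and regexp_substr(value, '(.*?)(;|$)', 1, 4, null, 1) is not null
  )
  group by value
);
```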
Incidentally, using a keyword or a function name like count as a column name is confusing (sum(count) !?); those might not be your real names though.
I would use regexp_replace to remove the 2nd and 3rd parts of the string, combined with an aggregate query to get the total count, like this:
SELECT
regexp_replace(value, '^([^;]+;)[^;]*;[^;]*;([^;]+;).*$', '\1\2'), -- keep only the 1st and 4th parts
SUM(count)
FROM source
WHERE
regexp_like(value, '^[^;]+;[^;]*;[^;]*;[^;]+;')
GROUP BY
regexp_replace(value, '^([^;]+;)[^;]*;[^;]*;([^;]+;).*$', '\1\2')

REGEXP_REPLACE and REGEXP_EXTRACT

I have a URI column coming in a log. I have to parse it, replace certain parts of it, and store the result in a table. For example, if I have /v7/cp/members/~PERF1SP826T90869AN/options, then I have to store it as /v7/cp/members/*/options. Can I do that using REGEXP_REPLACE?
I would also like to know whether I can store the part that I removed from the URI in another column.
For example, from /v7/cp/members/~PERF1SP826T90869AN/options, I should store /v7/cp/members/*/options in one column and PERF1SP826T90869AN in a separate column.
If you are using Oracle, here's a method:
SQL> with tbl(str) as (
select '/v7/cp/members/~PERF1SP826T90869AN/options' from dual
)
select regexp_replace(str, '(.*?)(/|$)', '*/', 1, 5) as replaced,
regexp_substr(str, '(.*?)(/|$)', 1, 5, NULL, 1) as fifth_element
from tbl;
REPLACED                 FIFTH_ELEMENT
------------------------ -------------------
/v7/cp/members/*/options ~PERF1SP826T90869AN
SQL>

pl/sql save hierarchical query to variable

I am trying to save the result set of a hierarchical query to a variable:
CREATE OR REPLACE FUNCTION test12
RETURN number IS
result number(4):=0;
clli_array dbms_sql.varchar2_table;
BEGIN
with tmp as (select 'strforregexp' str from dual)
select regexp_substr(str, '\/([A-Z0-9]{11}|[A-Z0-9]{8})', 1, level) STR into :clli_array from tmp
connect by regexp_substr(str, '\/([A-Z0-9]{11}|[A-Z0-9]{8})', 1, level) is not null;
END test12;
But I am getting an error:
Error(9,9): PLS-00049: bad bind variable 'CLLI_ARRAY'
So I have 2 questions:
1) Can I get all matches of the regexp without a hierarchical query?
2) Why am I getting this error?
As @APC pointed out, the first problem is that you've got a colon in front of CLLI_ARRAY. This causes the PL/SQL compiler to treat CLLI_ARRAY as a host bind variable supplied by the calling environment, and since bind variables are not allowed in a stored function, it throws the error you got.
However, even if you remove the colon you're not out of the woods yet. Once you remove the colon you'll get
PLS-00597: expression 'CLLI_ARRAY' in the INTO list is of wrong type
That's because CLLI_ARRAY is a PL/SQL collection, but a plain SELECT ... INTO expects a scalar target for the single string that each row returns.
What you probably want to do is to use BULK COLLECT to have the system retrieve all the results of the query into your VARCHAR2_TABLE:
with tmp as (select 'strforregexp' str from dual)
select regexp_substr(str, '\/([A-Z0-9]{11}|[A-Z0-9]{8})', 1, level) STR
BULK COLLECT into clli_array
from tmp
connect by regexp_substr(str, '\/([A-Z0-9]{11}|[A-Z0-9]{8})', 1, level) is not null;
Best of luck.
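Putting the pieces together, the function could look roughly like the sketch below. The input string is a made-up sample (the question only shows a placeholder), and returning the number of matches is an assumption, since the original function declares a NUMBER return type but never returns anything.

```sql
CREATE OR REPLACE FUNCTION test12
  RETURN NUMBER
IS
  clli_array dbms_sql.varchar2_table;  -- associative array of VARCHAR2
BEGIN
  -- Hypothetical sample input; replace with the real string to parse.
  WITH tmp AS (SELECT '/ABCDEFGH123/ABCD1234' str FROM dual)
  SELECT regexp_substr(str, '\/([A-Z0-9]{11}|[A-Z0-9]{8})', 1, LEVEL)
    BULK COLLECT INTO clli_array       -- no colon: this is a local PL/SQL variable
    FROM tmp
  CONNECT BY regexp_substr(str, '\/([A-Z0-9]{11}|[A-Z0-9]{8})', 1, LEVEL) IS NOT NULL;

  -- Assumption: report how many matches were collected.
  RETURN clli_array.COUNT;
END test12;
/
```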

Oracle invalid number in clause

I'm struggling with getting a query to work, and I could really use some help. We have an in house app that we use for building small web apps where I work. It's basically a drag and drop GUI. There's functionality built in to access query string values using the key.
I'm passing a comma separated list of values into a page through the query string. I'm then trying to use the list of values as part of an in clause in a query.
I can see that the value is correct in the query string.
orders=1,2,3
Here's the specific part of the query
"AND OrderNumber IN (this is where it maps from the query string)
I've tried running similar queries in Toad, and I think I've found the issue. It's giving an invalid number error, and I think it's wrapping the query string value in single quotes. I can replicate the error when I do "AND OrderNumber IN ('1,2,3')" in Toad.
Here's where I get really confused. The following works in Toad.
"AND OrderNumber IN ('1','2','3')"
So I tried recreating that by doing
select replace('1,2,3', ',', chr(39)||','||chr(39)) from dual;
I have confirmed that returns '1','2','3' in Toad.
However, I still get an Invalid Number error when I run the following in Toad.
AND OrderNumber IN (replace('1,2,3', ',', chr(39)||','||chr(39))
I've been racking my brain over this, and I can't figure it out. It seems to me that if "AND OrderNumber IN ('1','2','3')" works, and replace('1,2,3', ',', chr(39)||','||chr(39)) returns '1','2','3', that "AND OrderNumber IN (replace('1,2,3', ',', chr(39)||','||chr(39))" should work.
Any help you might be able to offer on this would be greatly appreciated. I know the rest of the query works. That's why I didn't post it. I'm just stuck on trying to get this IN clause working.
A change to phonetic_man's answer that will allow for NULL elements in the list. The regex format of '[^,]+' for parsing delimited lists does not handle NULL list elements and will return an incorrect value if one exists and thus its use should be avoided. Change the original by deleting the number 2 for instance and see the results. You will get a '3' in the 2nd element's position! Here's a way that handles the NULL and returns the correct value for the element:
SELECT TRIM(REGEXP_SUBSTR(str, '(.*?)(,|$)', 1, LEVEL, NULL, 1)) str
FROM ( SELECT '1,,3,4' str FROM dual )
connect by level <= regexp_count(str, ',') + 1;
See here for more info and proof: https://stackoverflow.com/a/31464699/2543416
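For completeness, here is a sketch of how the NULL-safe split might be plugged into the IN clause. The table orders and column orderno are taken from phonetic_man's answer; the TO_NUMBER call is an addition that makes the conversion to a number explicit. A NULL element in the list simply matches no row.

```sql
SELECT *
  FROM orders
 WHERE orderno IN (
         -- Each list element becomes one row; empty elements become NULL.
         SELECT TO_NUMBER(TRIM(REGEXP_SUBSTR(str, '(.*?)(,|$)', 1, LEVEL, NULL, 1)))
           FROM (SELECT '1,,3,4' str FROM dual)
        CONNECT BY LEVEL <= REGEXP_COUNT(str, ',') + 1
       );
```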
Can you try the following query.
SELECT * FROM orders
WHERE orderno IN
(
SELECT TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) str
FROM ( SELECT '1,2,3,4' str FROM dual )
CONNECT BY INSTR(str, ',', 1, LEVEL - 1) > 0
)
The inline query splits the string into separate rows. So, on executing it you will get the following result.
SELECT trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM ( SELECT '1,2,3,4' str FROM dual )
CONNECT BY instr(str, ',', 1, LEVEL - 1) > 0
1
2
3
4
Now, passing this result to the main query IN clause should work.
I think the desired clause to be built is:
AND OrderNumber IN (1,2,3)
A numeric list. The example you tested:
AND OrderNumber IN ('1','2','3')
works because an implicit conversion from a VARCHAR2 to a NUMBER is occurring for each element in the list.
The following clause will not work because no implicit conversion of the string '1,2,3' can be made (note the clause has a single string element):
AND OrderNumber IN ('1,2,3')
When doing the replace, you are replacing the single string 1,2,3 with the single string 1','2','3. The IN list still contains exactly one value, and that string cannot be implicitly converted to a number.

Is it possible to check whether a value is in the list of items in a single Oracle Decode Function?

I would like to know whether I can compare a value with a list of items in a DECODE function. Basically, I want to know whether it is possible to make a DECODE statement's 'search' value a list. For example,
decode(task_id, (1,2,3), 3 * task_time)
This piece of code won't compile though. Is this the only option for this case then (without using case-when) or are there alternative ways of doing this?
decode(task_id, 1, 3 * task_time,
2, 3 * task_time,
3, 3 * task_time)
I am using Oracle 10gR2. Any help is much appreciated.
If a single list of values is sufficient, you can turn it into a CASE and IN clause:
case when task_id in (1, 2, 3) then 3 * task_time else null end
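A small self-contained sketch of that CASE expression, run against made-up sample rows (the CTE name a, the alias tripled, and the data are illustrative only). The ELSE branch is left out here because CASE returns NULL by default when no branch matches.

```sql
with a as (
  select 1 task_id, 11 task_time from dual union all
  select 2 task_id, 11 task_time from dual union all
  select 3 task_id, 11 task_time from dual union all
  select 4 task_id, 11 task_time from dual
)
select task_id,
       -- 33 for task_id 1, 2 and 3; NULL for task_id 4
       case when task_id in (1, 2, 3) then 3 * task_time end as tripled
from a;
```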
I don't think it's possible to use a list with decode in this way. Per the docs:
DECODE compares expr to each search value one by one. If expr is equal
to a search, then Oracle Database returns the corresponding result. If
no match is found, then Oracle returns default
So task_id is compared with a search value one by one. If search value was a list, you couldn't compare with a single value.
I found a solution :)
select
decode(
task_id,
(select task_id from dual where task_id in (1,2,3)),
3*task_time)
decode ( (taskid-1)*(taskid-2)*(taskid-3), 0, 3 * tasktime ) could do what you want
Here's a working example:
with a as (
select 1 taskid, 11 tasktime from dual union all
select 2 taskid, 11 tasktime from dual union all
select 3 taskid, 11 tasktime from dual union all
select 4 taskid, 11 tasktime from dual
)
select
taskid,
decode (
(taskid-1) *
(taskid-2) *
(taskid-3) ,
0, 3 * tasktime
) decoded
from a;
you can use union all:
select 3 * task_time from your_table where task_id in (1,2,3)
union all
select task_time from your_table where task_id not in (1,2,3)
but why ?
In an IF condition:
IF vVal in (3,1,2) THEN
dbms_output.put_line('FOUND');
ELSE
dbms_output.put_line('NOT FOUND');
END IF;
I've seen instr(...) and static strings used as a way to quickly determine whether a value is one of multiple you're looking for as a condition for what value you return. You may need to choose a delimiter, but with limited datasets you can even omit it. It avoids using case-when, subqueries, and PL/SQL. As far as I know, there is no shorter way to do this:
decode(instr('123', taskid), 0, null, taskid * 3)
It's also very convenient when you want to set exceptions (for instance returning taskid without multiplication if it equals 1):
decode(instr('12345', taskid), 0, null, 1, taskid, taskid * 3)
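To illustrate the remark about delimiters, the sketch below compares the undelimited search string with a comma-delimited one. The sample rows and the aliases no_delim and with_delim are made up for the example. Without delimiters, taskid 12 counts as a match because '12' is a substring of '123'; wrapping both the list and the value in commas avoids that.

```sql
with a as (
  select 1 taskid from dual union all
  select 2 from dual union all
  select 12 from dual union all
  select 4 from dual
)
select taskid,
       -- substring match: 1, 2 and also 12 are treated as found
       decode(instr('123', taskid), 0, null, taskid * 3) as no_delim,
       -- delimited match: only 1, 2 and 3 are treated as found
       decode(instr(',1,2,3,', ',' || taskid || ','), 0, null, taskid * 3) as with_delim
from a;
```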

Resources