REGEXP_REPLACE and REGEXP_EXTRACT

I have a URI column coming in from a log. I have to parse it, remove certain parts from it, and store the result in a table. For example, if I have /v7/cp/members/~PERF1SP826T90869AN/options, then I have to store it as /v7/cp/members/*/options. Can I do that using REGEXP_REPLACE?
I would also like to store the part that I removed from the URI in another column.
For example, from /v7/cp/members/~PERF1SP826T90869AN/options, I should store /v7/cp/members/*/options in one column and PERF1SP826T90869AN in a separate column.

If you are using Oracle, here's a method:
SQL> with tbl(str) as (
select '/v7/cp/members/~PERF1SP826T90869AN/options' from dual
)
select regexp_replace(str, '(.*?)(/|$)', '*/', 1, 5) as replaced,
regexp_substr(str, '(.*?)(/|$)', 1, 5, NULL, 1) as fifth_element
from tbl;
REPLACED FIFTH_ELEMENT
------------------------ -------------------
/v7/cp/members/*/options ~PERF1SP826T90869AN
SQL>
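Note that the query above keeps the leading ~ in the extracted element, while the question asks for PERF1SP826T90869AN without it. A minimal sketch of one way to drop it, assuming the segment to mask is always the one that starts with ~ (an assumption, not something the question states) and an Oracle version that supports the subexpression argument of REGEXP_SUBSTR (11g or later):

```sql
-- Assumption: the segment to mask is the one that begins with "~".
with tbl(str) as (
  select '/v7/cp/members/~PERF1SP826T90869AN/options' from dual
)
select regexp_replace(str, '/~[^/]+', '/*')           as replaced,     -- mask the ~segment
       regexp_substr(str, '/~([^/]+)', 1, 1, NULL, 1) as removed_part  -- capture it without the ~
from tbl;
```

With the sample URI this should return /v7/cp/members/*/options and PERF1SP826T90869AN.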

Related

In oracle sql queries - Is it valid to use % in between a search string

For example
select * from tbl where msg like '%<CDT>5000%<DBT>1000%'
msg: <TXN1><CDT>5000<\CDT><\TXN1><something else><TXN2><DBT>1000<\DBT><\TXN2>
I am looking to extract column values, if it has CDT as 5000 and DBT as 1000
The title question is:
Is it valid to use % in the middle of a search string, for example
where msg like '%5000%1000%'
Yes, it is valid.
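A small illustration of what an inner % does, using two made-up sample strings (they are assumptions for the sketch, not data from the question). Each % matches any run of characters, including none, so the pattern requires 5000 to appear somewhere before 1000:

```sql
-- Hypothetical sample values; only the first has 5000 before 1000.
with samples(msg) as (
  select 'a5000b1000c' from dual union all
  select 'a1000b5000c' from dual
)
select msg
from samples
where msg like '%5000%1000%';
```

Only a5000b1000c is returned, because in the second string 1000 comes before 5000.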
What would CDT and DBT be? Extract which column values?
I'm not sure, but I think what you want is to take the value of everything between <CDT> and </CDT>.
The following code does that:
with raw_text(t) as( select '<TXN1><CDT>5000</CDT></TXN1>' from dual)
SELECT *
FROM raw_text
CROSS JOIN
XMLTABLE (
'//CDT'
PASSING XMLTYPE (raw_text.t)
COLUMNS CDT VARCHAR2 (1000) PATH './text()')
Let's assume that you have valid XML data (a single root element, matching opening and closing tags, and / rather than \ in the closing tags). For example:
CREATE TABLE tbl (id, msg) AS
SELECT 1, '<ROOT><TXN1><CDT>5000</CDT></TXN1><something /><TXN2><DBT>1000</DBT></TXN2></ROOT>' FROM DUAL UNION ALL
SELECT 2, '<ROOT><TXN1><CDT>50000</CDT></TXN1><something /><TXN2><DBT>1000</DBT></TXN2></ROOT>' FROM DUAL UNION ALL
SELECT 3, '<ROOT><TXN1><CDT>5000</CDT></TXN1><something /><TXN2><DBT>10000</DBT></TXN2></ROOT>' FROM DUAL UNION ALL
SELECT 4, '<ROOT><TXN2><DBT>1000</DBT></TXN2><TXN1><CDT>5000</CDT></TXN1><something /></ROOT>' FROM DUAL UNION ALL
SELECT 5, '<ROOT><TXN1><CDT note="match me too">5000</CDT></TXN1><something /><TXN2><DBT>1000</DBT></TXN2></ROOT>' FROM DUAL;
Here ids 1, 4 and 5 all have the values 5000 and 1000, but id 4 has the order of the transactions reversed in the XML (which is completely valid XML and does not change the data at all), and id 5 is the same as id 1 but with an added attribute on the CDT element. id 2 has a CDT of 50000 instead of 5000, and id 3 has a DBT of 10000 instead of 1000.
Then, yes, you could use:
SELECT *
FROM tbl
WHERE msg LIKE '%<CDT>5000%<DBT>1000%'
But it would return rows 1, 2 and 3 and would not match row 4 or 5. This is probably not what you want.
You could eliminate rows 2 and 3 by matching the end tags as well:
SELECT *
FROM tbl
WHERE msg LIKE '%<CDT>5000</CDT>%<DBT>1000</DBT>%'
But that still does not match when the tags are reversed or when there are additional attributes.
If you want to match the values then you can use XMLEXISTS:
SELECT *
FROM tbl
WHERE XMLEXISTS('//CDT[text()=5000]' PASSING XMLTYPE(msg))
AND XMLEXISTS('//DBT[text()=1000]' PASSING XMLTYPE(msg))
Which outputs:
ID MSG
-- ---------------------------------------------------------------------------------------------------
1 <ROOT><TXN1><CDT>5000</CDT></TXN1><something /><TXN2><DBT>1000</DBT></TXN2></ROOT>
4 <ROOT><TXN2><DBT>1000</DBT></TXN2><TXN1><CDT>5000</CDT></TXN1><something /></ROOT>
5 <ROOT><TXN1><CDT note="match me too">5000</CDT></TXN1><something /><TXN2><DBT>1000</DBT></TXN2></ROOT>
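If the goal is also to extract the two values as columns, a sketch along the following lines may help. It assumes the sample tbl above, with CDT always under TXN1 and DBT always under TXN2 inside a single ROOT element; the column aliases cdt and dbt are names chosen here for illustration.

```sql
-- Sketch: expose CDT and DBT as relational columns, then filter on them.
select t.id, x.cdt, x.dbt
from tbl t
cross join xmltable(
  '/ROOT'
  passing xmltype(t.msg)
  columns cdt number path 'TXN1/CDT',
          dbt number path 'TXN2/DBT'
) x
where x.cdt = 5000
and x.dbt = 1000;
```

Against the sample data this should return ids 1, 4 and 5, since neither element order nor the extra attribute affects the extracted values.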

How to chunk a string in pl sql using regexp?

I have a string as follows: ABCAPP9 Xore-Done-1. I want to split it into its 4 elements in PL/SQL, retrieving one element at a time. Please show me the 4 queries that return the following 4 results separately. Thanks.
ABCAPP9
Xore
Done
1
REGEXP_SUBSTR ('ABCAPP9 Xore-Done-1', '[^[:space:]-]+', 1, n)
will give you the n-th part. Replace n with the position you want. Here are all of them:
SELECT REGEXP_SUBSTR ('ABCAPP9 Xore-Done-1', '[^[:space:]-]+', 1, LEVEL)
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('ABCAPP9 Xore-Done-1', '[^[:space:]-]+', 1, LEVEL) IS NOT NULL
This should really be a comment to @Mottor, but since comments allow no formatting I need to post it here.
A word of warning. As long as all elements of your string will be present and the delimiters can NEVER be next to each other you will be ok. However, the regex format of '[^<delimiter>]+' commonly used for parsing strings will not return the correct value if there is a NULL element in the list! See this post for proof: https://stackoverflow.com/a/31464699/2543416. To test in your example, remove the substring "Xore", leaving the space and hyphen next to each other:
SQL> SELECT REGEXP_SUBSTR ('ABCAPP9 -Done-1', '[^[:space:]-]+', 1, LEVEL)
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('ABCAPP9 -Done-1', '[^[:space:]-]+', 1, LEVEL) IS NOT NULL;
REGEXP_SUBSTR('
---------------
ABCAPP9
Done
1
The 2nd element should be NULL, but "Done" is returned instead! Not good if the position is important.
Use this format instead to handle NULLs and return the correct string element in the correct position (shown here with "Xore" removed and thus a NULL returned in that position to prove it handles the NULL):
SQL> with tbl(str) as (
select 'ABCAPP9 -Done-1' from dual
)
select regexp_substr(str, '(.*?)( |-|$)', 1, level, NULL, 1)
from tbl
connect by regexp_substr(str, '(.*?)( |-|$)', 1, level) is not null;
REGEXP_SUBSTR(S
---------------
ABCAPP9

Done
1
SQL>
I shudder to think of all the bad data being returned out there.
So user2153047, if you are still with me: if you want the 3rd element (with NULLs handled), you would use:
SQL> select regexp_substr('ABCAPP9 -Done-1', '(.*?)( |-|$)', 1, 3, NULL, 1) "3rd"
from dual;
3rd
----
Done

pl-sql include column names in query

A weird request, maybe, but my boss wants me to create an admin version of a page we have that displays data from an Oracle query in a table.
The admin page, instead of displaying the data (the query returns 1 row), needs to return the table name and column name.
Ex: Instead of:
Name Initial
==================
Bob A
I want:
Name Initial
============================
Users.FirstName Users.MiddleInitial
I realize I can do this in code but would rather just modify the query to return the data I want so I can leave the report generation code mostly alone.
I don't want to do it in a stored procedure.
So when I spit out the data in the report using something like:
blah blah = MyDataRow("FirstName")
I can leave that as is but instead of it displaying "BOB" it would display "Users.FirstName"
And I want to do the query using select * if possible instead of listing all the columns
So for each of the columns I am querying in the * , I want to get (instead of the column value) the tablename.ColumnName or tablename|columnName
hope you are following- I am confusing myself...
pseudo:
select tablename + '.' + Columnname as WhateverTheColumnNameIs
from Table1
left join Table2 on whatever...
Join Table_Names on blah blah
Whew- after writing all this I think I will just do it on the code side.
But if you are up for it maybe a fun challenge
Oracle does not provide a built-in way (there is no pseudocolumn) to get the column names of a table as the result of a query against that table. But you might consider these two approaches:
Extract the column names from an xmltype, formed by passing a cursor expression (your query) to the xmltable() function:
-- your table
with t1(first_name, middle_name) as(
select 1,2 from dual
), -- your query
t2 as(
select * -- col1 as "t1.col1"
--, col2 as "t1.col2"
--, col3 as "t1.col3"
from t1
)
select *
from ( select q.object_value.getrootelement() as col_name
, rownum as rn
from xmltable('//*'
passing xmltype(cursor(select * from t2 where rownum = 1))
) q
where q.object_value.getrootelement() not in ('ROWSET', 'ROW')
)
pivot(
max(col_name) for rn in (1 as "name", 2 as "initial")
)
Result:
name initial
--------------- ---------------
FIRST_NAME MIDDLE_NAME
Note: In order for column names to be prefixed with the table name, you need to list them explicitly in the select list of the query and supply an alias manually.
PL/SQL approach. Starting from Oracle 11g, you could use the dbms_sql package, specifically its describe_columns() procedure, to get the names of the columns in the cursor (your select).
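A minimal sketch of the dbms_sql approach, assuming a table named users (taken from the question's Users.FirstName example) and that server output is enabled in the client. Note that describe_columns reports column names only, not the owning table, so any table prefix still has to be added by hand.

```sql
-- Sketch: print the column names of an arbitrary select.
-- Assumption: a table called USERS exists in the current schema.
declare
  l_cursor  integer := dbms_sql.open_cursor;
  l_count   integer;
  l_columns dbms_sql.desc_tab;
begin
  dbms_sql.parse(l_cursor, 'select * from users', dbms_sql.native);
  dbms_sql.describe_columns(l_cursor, l_count, l_columns);
  for i in 1 .. l_count loop
    dbms_output.put_line(l_columns(i).col_name);
  end loop;
  dbms_sql.close_cursor(l_cursor);
end;
/
```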
This might be what you are looking for, try selecting from system views USER_TAB_COLS or ALL_TAB_COLS.
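For instance, a query of roughly this shape returns table-qualified column names from the data dictionary. The table name USERS is an assumption based on the question; dictionary views store unquoted identifiers in upper case.

```sql
-- Sketch: list the columns of USERS as TABLE.COLUMN, in column order.
select table_name || '.' || column_name as qualified_name
from user_tab_cols
where table_name = 'USERS'
order by column_id;
```

This returns one row per column rather than one row with a column per name, so the report code would still need to map the rows onto its fields.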

PL/SQL query IN comma delimited string

I am developing an application in Oracle APEX. I have a comma-delimited string of user IDs that looks like this:
45,4932,20,19
This string is stored as
:P5_USER_ID_LIST
I want a query that will find all users that are within this list. My query looks like this:
SELECT * FROM users u WHERE u.user_id IN (:P5_USER_ID_LIST);
I keep getting an Oracle error: invalid number. However, if I hard-code the string into the query, it works, like this:
SELECT * FROM users u WHERE u.user_id IN (45,4932,20,19);
Anyone know why this might be an issue?
A bind variable binds a value, in this case the string '45,4932,20,19'. You could use dynamic SQL and concatenation as suggested by Randy, but you would need to be very careful that the user is not able to modify this value, otherwise you have a SQL Injection issue.
A safer route would be to put the IDs into an Apex collection in a PL/SQL process:
declare
array apex_application_global.vc_arr2;
begin
array := apex_util.string_to_table (:P5_USER_ID_LIST, ',');
apex_collection.create_or_truncate_collection ('P5_ID_COLL');
apex_collection.add_members ('P5_ID_COLL', array);
end;
Then change your query to:
SELECT * FROM users u WHERE u.user_id IN
(SELECT c001 FROM apex_collections
WHERE collection_name = 'P5_ID_COLL')
An easier solution is to use instr:
SELECT * FROM users u
WHERE instr(',' || :P5_USER_ID_LIST ||',' ,',' || u.user_id|| ',', 1) !=0;
The tricks:
',' || :P5_USER_ID_LIST ||','
turns your string into ,45,4932,20,19,
',' || u.user_id|| ','
gives, for example, ,32, which avoids matching the 32 inside ,4932,
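The effect of the surrounding commas can be checked on literals. In this sketch the list and the ID 32 are hard-coded purely for illustration:

```sql
-- Compare a delimited search with a naive substring search for ID 32.
select instr(',45,4932,20,19,', ',32,') as delimited_hit,  -- 0: 32 is not in the list
       instr('45,4932,20,19', '32')     as naive_hit       -- 6: false hit inside 4932
from dual;
```

The delimited form returns 0 because 32 is not a list member, while the naive form finds the 32 inside 4932.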
I have faced this situation several times, and here is what I've used:
SELECT *
FROM users u
WHERE ','||to_char(:P5_USER_ID_LIST)||',' like '%,'||to_char(u.user_id)||',%'
I've used the LIKE operator, but you must be careful about one aspect: the list has to be wrapped in commas (the query above does this, producing ",45,4932,20,19,") so that LIKE compares against an exactly delimited number such as ",45,".
Written this way, the select will not mistake, say, 5 for 15, 155 or 55.
Try it out and let me know how it goes;)
Cheers ,
Alex
Create a native query rather than using "createQuery/createNamedQuery"
The reason this is an issue is that you cannot just bind an in list the way you want, and just about everyone makes this mistake at least once as they are learning Oracle (and probably SQL!).
When you bind the string '32,64,128', it effectively becomes a query like:
select ...
from t
where t.c1 in ('32,64,128')
To Oracle this is totally different to:
select ...
from t
where t.c1 in (32,64,128)
The first example has a single string value in the IN list and the second has three numbers in the IN list. The reason you get an invalid number error is that Oracle attempts to cast the string '32,64,128' to a number, which it cannot do because of the commas in the string.
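The failing conversion can be reproduced in isolation. A small sketch using the same literal as above:

```sql
-- A single numeric string converts without problems.
select to_number('32') from dual;

-- The whole CSV string is not a number; this raises ORA-01722: invalid number.
select to_number('32,64,128') from dual;
```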
A variation of this "how do I bind an in list" question has come up on here quite a few times recently.
Generically, and without resorting to any PL/SQL, worrying about SQL injection, or binding the query incorrectly, you can use this trick:
with bound_inlist
as
(
select
substr(txt,
instr (txt, ',', 1, level ) + 1,
instr (txt, ',', 1, level+1) - instr (txt, ',', 1, level) -1 )
as token
from (select ','||:txt||',' txt from dual)
connect by level <= length(:txt)-length(replace(:txt,',',''))+1
)
select *
from bound_inlist a, users u
where a.token = u.id;
If possible the best idea may be to not store your user ids in csv! Put them in a table or failing that an array etc. You cannot bind a csv field as a number.
Please don't use WHERE ','||to_char(:P5_USER_ID_LIST)||',' like '%,'||to_char(u.user_id)||',%' because it forces a full table scan. With the users table you may not have that many rows, so the impact will be low, but against other tables in an enterprise environment this is a problem.
EDIT: I have put together a script to demonstrate the differences between the regex method and the wildcard like method. Not only is regex faster but it's also a lot more robust.
-- Create table
create table CSV_TEST
(
NUM NUMBER not null,
STR VARCHAR2(20)
);
create sequence csv_test_seq;
begin
for j in 1..10 loop
for i in 1..500000 loop
insert into csv_test( num, str ) values ( csv_test_seq.nextval, to_char( csv_test_seq.nextval ));
end loop;
commit;
end loop;
end;
/
-- Create/Recreate primary, unique and foreign key constraints
alter table CSV_TEST
add constraint CSV_TEST_PK primary key (NUM)
using index ;
alter table CSV_TEST
add constraint CSV_TEST_FK unique (STR)
using index;
select sysdate from dual;
select *
from csv_test t
where t.num in ( Select Regexp_Substr('100001, 100002, 100003 , 100004, 100005','[^,]+', 1, Level) From Dual
Connect By Regexp_Substr('100001, 100002,100003, 100004, 100005', '[^,]+', 1, Level) Is Not Null);
select sysdate from dual;
select *
from csv_test t
where ('%,' || '100001,100002, 100003, 100004 ,100005' || ',%') like '%,' || num || ',%';
select sysdate from dual;
select *
from csv_test t
where t.num in ( Select Regexp_Substr('100001, 100002, 100003 , 100004, 100005','[^,]+', 1, Level) From Dual
Connect By Regexp_Substr('100001, 100002,100003, 100004, 100005', '[^,]+', 1, Level) Is Not Null);
select sysdate from dual;
select *
from csv_test t
where ('%,' || '100001,100002, 100003, 100004 ,100005' || ',%') like '%,' || num || ',%';
select sysdate from dual;
drop table csv_test;
drop sequence csv_test_seq;
Solution from Tony Andrews works for me. The process should be added to "Page processing" >> "After submit">> "Processes".
Because you are storing the user IDs as a string, you can match them with LIKE, wrapping both sides in commas so that only whole IDs match:
SELECT * FROM users u WHERE ',' || :P5_USER_ID_LIST || ',' LIKE '%,' || u.user_id || ',%'
For example, with
:P5_USER_ID_LIST = 45,4932,20,19
the query returns every row of the users table whose user_id appears in the list.
You will need to run this as dynamic SQL: build the entire statement as a string, then execute it dynamically.

How do I display a field's hidden characters in the result of a query in Oracle?

I have two rows with a varchar column whose values differ according to a Java .equals(). I can't easily change or debug the Java code that's running against this particular database, but I can run queries directly against the database using SQL Developer. The fields look the same to me (they are street addresses with two lines separated by a newline or a carriage return/newline combination).
Is there a way to see all of the hidden characters in the result of a query? I'd like to avoid having to use the ascii() function with substr() on each of the rows to figure out which hidden character is different.
I'd also accept a query that shows me which character is the first difference between the two fields.
Try
select dump(column_name) from table
More information is in the documentation.
As for finding the position where the character differs, this might give you an idea:
create table tq84_compare (
id number,
col varchar2(20)
);
insert into tq84_compare values (1, 'hello world');
insert into tq84_compare values (2, 'hello' || chr(9) || 'world');
with c as (
select
(select col from tq84_compare where id = 1) col1,
(select col from tq84_compare where id = 2) col2
from
dual
),
l as (
select
level l from dual
start with 1=1
connect by level < (select length(c.col1) from c)
)
select
max(l.l) + 1 position
from c,l
where substr(c.col1,1,l.l) = substr(c.col2,1,l.l);
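To see the differing character itself for the two sample rows, DUMP can be run against the same table. A short sketch, assuming the tq84_compare table and rows created above:

```sql
-- Show the byte codes of each stored value, one row per id.
select id, dump(col) as bytes
from tq84_compare
order by id;
```

In the byte lists, the sixth code should be 32 (space) for id 1 and 9 (tab) for id 2, which lines up with the position the comparison query reports.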
SELECT DUMP('€ÁÑ', 1016)
FROM DUAL
... will print something like:
Typ=96 Len=3 CharacterSet=WE8MSWIN1252: 80,c1,d1
